fix(cysql): correct edge filtering, recursive depth bounds, SQL join predicate placement, fused edge index - BED-6499#45
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
Walkthrough

This PR moves many node/edge predicates from `WHERE` into `JOIN ON` clauses, tightens recursive traversal bounds (`<=` → `<`), adds locality-aware constraint partitioning in the translator, enforces `float8` casts for aggregate inputs, tweaks traversal/projection logic, updates tests/assertions, fixes a triplestore edge check, and bumps Go/toolchain deps.
Sequence diagram(s): omitted — changes are internal translation refactorings without a new multi-component sequential flow warranting a diagram.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
container/triplestore.go (1)
282-290: ⚠️ Potential issue | 🔴 Critical

Bug: `EachAdjacentEdge` has the same duplicate `next.Start` check that was fixed in `EachEdge`.

Line 284 checks `next.Start` twice instead of checking both `next.Start` and `next.End`. This will incorrectly include edges whose end node has been deleted.

🐛 Proposed fix

```diff
 func (s *triplestoreProjection) EachAdjacentEdge(node uint64, direction graph.Direction, delegate func(next Edge) bool) {
 	s.origin.EachAdjacentEdge(node, direction, func(next Edge) bool {
-		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.Start) {
+		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.End) {
 			return delegate(next)
 		}

 		return true
 	})
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@container/triplestore.go` around lines 282 - 290, In EachAdjacentEdge, the condition mistakenly checks next.Start twice so edges with a deleted end node slip through; update the predicate inside triplestoreProjection.EachAdjacentEdge to check both s.deletedNodes.Contains(next.Start) and s.deletedNodes.Contains(next.End) (and still check s.deletedEdges.Contains(next.ID)), then call delegate(next) only when none are deleted; use the same pattern as the corrected EachEdge and keep the s.origin.EachAdjacentEdge(...) wrapper and delegate return semantics unchanged.

cypher/models/pgsql/translate/traversal.go (1)
43-60: ⚠️ Potential issue | 🟠 Major

Parenthesize the directionless endpoint predicate before conjoining local filters.

For `graph.DirectionBoth`, `traversalStep.LeftNodeJoinCondition` and `traversalStep.RightNodeJoinCondition` are `OR` expressions. Feeding them straight into `OptionalAnd(...)` produces SQL like `name = 'n3' AND n1.id = e0.end_id OR n1.id = e0.start_id`, so the second branch bypasses the local filter. The generated SQL in `cypher/models/pgsql/test/translation_cases/nodes.sql` (lines 189–208) already shows this regression.

🐛 Proposed fix

```diff
 Joins: []pgsql.Join{{
 	Table: pgsql.TableReference{
 		Name:    pgsql.CompoundIdentifier{pgsql.TableNode},
 		Binding: models.OptionalValue(traversalStep.LeftNode.Identifier),
 	},
 	JoinOperator: pgsql.JoinOperator{
 		JoinType:   pgsql.JoinTypeInner,
-		Constraint: pgsql.OptionalAnd(leftJoinLocal, traversalStep.LeftNodeJoinCondition),
+		Constraint: pgsql.OptionalAnd(leftJoinLocal, pgsql.NewParenthetical(traversalStep.LeftNodeJoinCondition)),
 	},
 }, {
 	Table: pgsql.TableReference{
 		Name:    pgsql.CompoundIdentifier{pgsql.TableNode},
 		Binding: models.OptionalValue(traversalStep.RightNode.Identifier),
 	},
 	JoinOperator: pgsql.JoinOperator{
 		JoinType:   pgsql.JoinTypeInner,
-		Constraint: pgsql.OptionalAnd(rightJoinLocal, traversalStep.RightNodeJoinCondition),
+		Constraint: pgsql.OptionalAnd(rightJoinLocal, pgsql.NewParenthetical(traversalStep.RightNodeJoinCondition)),
 	},
 }},
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/traversal.go` around lines 43 - 60, The bug is that when traversalStep.LeftNodeJoinCondition or traversalStep.RightNodeJoinCondition is an OR (e.g., for graph.DirectionBoth) it must be parenthesized before conjoining with local filters; update the JoinOperator.Constraint construction to call OptionalAnd(leftJoinLocal, pgsql.ParenthesizedExpr(traversalStep.LeftNodeJoinCondition)) and OptionalAnd(rightJoinLocal, pgsql.ParenthesizedExpr(traversalStep.RightNodeJoinCondition)) (or the equivalent parenthesizing helper in pgsql) so the OR branch does not bypass the local ANDed filter.
🧹 Nitpick comments (1)
cypher/models/pgsql/translate/model.go (1)
124-130: Broaden `flattenConjunction` beyond pointer-form `AND` nodes.

Right now only `*pgsql.BinaryExpression` is flattened. If a conjunction reaches this helper as a value `pgsql.BinaryExpression` or under a `Parenthetical`, the whole term stays intact and join-local predicates can still get stranded in `WHERE`.

♻️ Proposed hardening

```diff
 func flattenConjunction(expr pgsql.Expression) []pgsql.Expression {
-	if bin, typeOK := expr.(*pgsql.BinaryExpression); !typeOK || bin.Operator != pgsql.OperatorAnd {
-		return []pgsql.Expression{expr}
-	} else {
-		return append(flattenConjunction(bin.LOperand), flattenConjunction(bin.ROperand)...)
-	}
+	switch typed := expr.(type) {
+	case *pgsql.Parenthetical:
+		return flattenConjunction(typed.Expression)
+	case *pgsql.BinaryExpression:
+		if typed.Operator == pgsql.OperatorAnd {
+			return append(flattenConjunction(typed.LOperand), flattenConjunction(typed.ROperand)...)
+		}
+	case pgsql.BinaryExpression:
+		if typed.Operator == pgsql.OperatorAnd {
+			return append(flattenConjunction(typed.LOperand), flattenConjunction(typed.ROperand)...)
+		}
+	}
+
+	return []pgsql.Expression{expr}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/model.go` around lines 124 - 130, flattenConjunction only handles pointer-form *pgsql.BinaryExpression and ignores non-pointer pgsql.BinaryExpression and Parenthetical wrappers, so update flattenConjunction to recognize both pointer and non-pointer binary nodes and to unwrap Parenthetical expressions before checking the operator; e.g., normalize the expr by stripping Parenthetical layers, then use a type switch to detect either *pgsql.BinaryExpression or pgsql.BinaryExpression with Operator == pgsql.OperatorAnd and recursively flatten bin.LOperand and bin.ROperand, returning the single expr when it’s not an AND binary node.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 27-33: The file contains a duplicated test case: two identical
Cypher->SQL blocks starting with the comment "-- case: match
(n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) ..."
and the CTE chain beginning with with s0 as (with s1 as (with recursive
s2(...))); remove one of these duplicate blocks (keep a single instance) —
locate the duplicate by the unique comment text and the CTE root symbol s0/s1/s2
and delete the redundant entire block including its trailing select s0.n2 as m
from s0 where (s0.i0 >= 10);
- Around line 76-99: Remove the duplicate SQL test blocks that repeat earlier
cases: delete the repeated case starting "match
(n:NodeKind1)<-[:EdgeKind1]-(:NodeKind2) where n.objectid ends with '-516'...",
the repeated "match (n:NodeKind1)-[:EdgeKind1]->(m:NodeKind2) where n.enabled =
true..." block, the two repeated "with \"a\" as check, \"b\" as ref ...
collect(tolower(g.samaccountname)) as refmembership, tolower(u.samaccountname)
as samname" blocks (both the variant matching EdgeKind2 with '-' and the '->'
variant), the repeated recursive path case "match p
=(n:NodeKind1)<-[r:EdgeKind1|EdgeKind2*..3]-(u:NodeKind1) ...", the repeated
"match (u:NodeKind1)-[:EdgeKind1]->(g:NodeKind2) with g match
(g)<-[:EdgeKind1]-(u:NodeKind1) return g" block, and the repeated "match
(cg:NodeKind1) where cg.name =~ ... collect (cg.email) as emails ..." block;
keep the first occurrence of each unique case (the initial s0/s1 CTE variants)
and remove subsequent duplicate CTE groups so each described case appears only
once.
In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql`:
- Around line 71-75: Two duplicate test cases are present: the blocks starting
with "-- case: match p =
(:NodeKind1)-[:EdgeKind1|EdgeKind2]->(e:NodeKind2)-[:EdgeKind2]->(:NodeKind1)
where 'a' in e.values ..." and "-- case: match p =
(n:NodeKind1)-[r]-(m:NodeKind1) return p" (which use s0/s1, e0/e1, n0/n1 and
edges_to_path) are exact copies of earlier cases and should be removed; delete
these two trailing case blocks so each scenario appears only once in the fixture
and leave the original versions earlier in the file intact.
- Line 54: The undirected endpoint OR is evaluated with wrong precedence so the
kind filter only applies to one branch; wrap the OR together and then AND the
kind check so the kind constraint applies to whichever endpoint matches.
Concretely, in the s0 subquery's JOINs for n0 and n1 (references: s0, e0, n0,
n1) change the ON clauses from "n0.kind_ids ... and n0.id = e0.end_id or n0.id =
e0.start_id" to "((n0.id = e0.end_id or n0.id = e0.start_id) and n0.kind_ids
...)" and do the analogous change for n1 so the NodeKind1 predicate is grouped
with the endpoint match.
- Line 42: The final path concatenation currently prepends the last edge by
using [(s2.e1).id] || s2.ep0; change it so the recursive path ep0 comes first
and then the last edge is appended: build the array from s0.ep0 (s1.path)
followed by the single-element array for s2.e1 before calling edges_to_path,
i.e. swap the concatenation order around the edges_to_path call that produces p
so the result is ep0 || array[(s2.e1).id], preserving forward order from s to i
to final node.
In `@cypher/models/pgsql/test/translation_cases/quantifiers.sql`:
- Around line 41-45: Remove the duplicated quantifier test blocks that repeat
the two cases beginning with the comments "-- case: MATCH (m:NodeKind1) WHERE
m.unconstraineddelegation = true ..." and "-- case: MATCH (m:NodeKind1) WHERE
ANY(name in m.serviceprincipalnames WHERE name CONTAINS "PHANTOM") ..."; locate
the repeated WITH ... select ... queries that build s0/s2 (and s0/s2 with arrays
i0/i1) and delete the second occurrence so each case appears only once,
preserving the first instance that defines s0/s2 and their final SELECT
returning m.
In `@cypher/models/pgsql/test/translation_cases/update.sql`:
- Around line 53-54: Remove the duplicate SQL test case block that begins with
the comment "-- case: match (a)-[r]->(:NodeKind1) set a.name = '123',
r.is_special_outbound = true'": the loader in testcase.go appends every "--
case:" block it finds, so keep only the first occurrence and delete the second
identical copy to avoid creating a duplicate test entry.
In `@cypher/models/pgsql/test/validation_integration_test.go`:
- Around line 68-71: Rename the misspelled variable cassesPassed to casesPassed
in the var block alongside casesRun and update every usage/reference to that
identifier in the test (e.g., the increments/assertions where cassesPassed is
read/updated); ensure the declaration reads "casesPassed" and all places that
previously referenced cassesPassed now reference casesPassed so the test
compiles and tracks passed cases correctly.
In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 191-194: Add a new timestamped SQL migration that creates the two
missing indexes for existing databases: create the edge_start_kind_index on edge
(start_id, kind_id) and create the edge_end_kind_index on edge (end_id,
kind_id); ensure the migration uses the same "create index if not exists"
semantics as in schema_up.sql so it is idempotent and will be applied by
incremental migration runners, and name the file with a proper timestamp prefix
so it runs in order alongside other migrations.
---
Outside diff comments:
In `@container/triplestore.go`:
- Around line 282-290: In EachAdjacentEdge, the condition mistakenly checks
next.Start twice so edges with a deleted end node slip through; update the
predicate inside triplestoreProjection.EachAdjacentEdge to check both
s.deletedNodes.Contains(next.Start) and s.deletedNodes.Contains(next.End) (and
still check s.deletedEdges.Contains(next.ID)), then call delegate(next) only
when none are deleted; use the same pattern as the corrected EachEdge and keep
the s.origin.EachAdjacentEdge(...) wrapper and delegate return semantics
unchanged.
In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 43-60: The bug is that when traversalStep.LeftNodeJoinCondition or
traversalStep.RightNodeJoinCondition is an OR (e.g., for graph.DirectionBoth) it
must be parenthesized before conjoining with local filters; update the
JoinOperator.Constraint construction to call OptionalAnd(leftJoinLocal,
pgsql.ParenthesizedExpr(traversalStep.LeftNodeJoinCondition)) and
OptionalAnd(rightJoinLocal,
pgsql.ParenthesizedExpr(traversalStep.RightNodeJoinCondition)) (or the
equivalent parenthesizing helper in pgsql) so the OR branch does not bypass the
local ANDed filter.
---
Nitpick comments:
In `@cypher/models/pgsql/translate/model.go`:
- Around line 124-130: flattenConjunction only handles pointer-form
*pgsql.BinaryExpression and ignores non-pointer pgsql.BinaryExpression and
Parenthetical wrappers, so update flattenConjunction to recognize both pointer
and non-pointer binary nodes and to unwrap Parenthetical expressions before
checking the operator; e.g., normalize the expr by stripping Parenthetical
layers, then use a type switch to detect either *pgsql.BinaryExpression or
pgsql.BinaryExpression with Operator == pgsql.OperatorAnd and recursively
flatten bin.LOperand and bin.ROperand, returning the single expr when it’s not
an AND binary node.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 522ca622-5fe3-419b-90c6-5b6f5d23b4fa
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (21)
- container/triplestore.go
- cypher/models/pgsql/test/testcase.go
- cypher/models/pgsql/test/translation_cases/multipart.sql
- cypher/models/pgsql/test/translation_cases/nodes.sql
- cypher/models/pgsql/test/translation_cases/pattern_binding.sql
- cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
- cypher/models/pgsql/test/translation_cases/quantifiers.sql
- cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
- cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
- cypher/models/pgsql/test/translation_cases/update.sql
- cypher/models/pgsql/test/validation_integration_test.go
- cypher/models/pgsql/translate/constraints.go
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/translate/expression.go
- cypher/models/pgsql/translate/function.go
- cypher/models/pgsql/translate/model.go
- cypher/models/pgsql/translate/predicate.go
- cypher/models/pgsql/translate/traversal.go
- cypher/models/walk/walk_pgsql.go
- drivers/pg/query/sql/schema_up.sql
- go.mod
66d540c to 44ddf42
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cypher/models/pgsql/translate/function.go (1)
309-334: ⚠️ Potential issue | 🟠 Major

Remove `float8` casting from `min`/`max` arguments.

Per Neo4j's official documentation, `min()` and `max()` accept ANY value type, not just numeric values. They support strings, temporal values (DATE, DATETIME, TIME, DURATION), and other comparable types. Casting arguments to `float8` violates Cypher semantics and will cause runtime failures for queries like `min(n.name)` or `min(n.created_date)`. The test cases already reflect this regression by forcing temporal values through numeric casting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/function.go` around lines 309 - 334, The MinFunction and MaxFunction handlers are incorrectly casting their single argument to pgsql.Float8 via TypeCastExpression, which breaks non-numeric inputs; update the cypher.MinFunction and cypher.MaxFunction cases to stop calling TypeCastExpression/using pgsql.Float8 and instead push the original popped operand as the sole parameter in the pgsql.FunctionCall (i.e., after the NumArguments check and PopOperand call, use the popped argument directly in treeTranslator.PushOperand with Function: pgsql.FunctionMin / pgsql.FunctionMax and Parameters: []pgsql.Expression{argument}); keep existing error handling and name checks intact.
♻️ Duplicate comments (2)
cypher/models/pgsql/test/translation_cases/multipart.sql (1)
32-33: ⚠️ Potential issue | 🟡 Minor

Remove the duplicated `count(distinct(n))` case.

This block is a byte-for-byte copy of the case at lines 29-30, so it adds no coverage and makes the fixture harder to maintain.

Verify by locating both identical `-- case:` headers; expected: this case should appear once.

```bash
#!/bin/bash
python - <<'PY'
from pathlib import Path

lines = Path("cypher/models/pgsql/test/translation_cases/multipart.sql").read_text().splitlines()
needle = "-- case: match (n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) where (n:NodeKind1 or n:NodeKind2) and n.enabled = true with m, count(distinct(n)) as p where p >= 10 return m"

for idx, line in enumerate(lines, 1):
    if line == needle:
        print(idx)
PY
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 32 - 33, Remove the duplicated test case block that starts with the comment header "-- case: match (n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) where (n:NodeKind1 or n:NodeKind2) and n.enabled = true with m, count(distinct(n)) as p where p >= 10 return m"; specifically delete the second byte-for-byte copy of that case (the entire WITH ... SELECT ... query that builds s0/s1/s2/s3 and returns s0.n2 as m) so only a single instance of the case remains, then run the provided verification script or search for the identical "-- case:" header to confirm there is exactly one occurrence.

cypher/models/pgsql/test/translation_cases/pattern_binding.sql (1)
54-54: ⚠️ Potential issue | 🟠 Major

Parenthesize the `n1` endpoint OR in the undirected case.

`n0` was fixed, but `n1` still parses as `(NodeKind1 AND end_id) OR start_id`. That lets the `start_id` branch bind a non-`NodeKind1` node, so the expected SQL still weakens `match p = (n:NodeKind1)-[r]-(m:NodeKind1)`.

🛠️ Suggested fix

```diff
-n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id
+n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)
```

Verify by inspecting the `n1` join; expected: both endpoint branches should be grouped under the same `NodeKind1` predicate.

```bash
#!/bin/bash
sed -n '53,54p' cypher/models/pgsql/test/translation_cases/pattern_binding.sql
rg -nP 'join node n1 on .*n1\.kind_ids.* and n1\.id = e0\.end_id or n1\.id = e0\.start_id' cypher/models/pgsql/test/translation_cases/pattern_binding.sql
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql` at line 54, The join for n1 is missing parentheses so the OR binds only to the id checks instead of grouping both endpoint id branches under the NodeKind1 predicate; locate the "join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id" expression in pattern_binding.sql and wrap the endpoint OR with parentheses so the condition reads like "join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)" (i.e., ensure the kind_ids check and both id branches are grouped together).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@container/triplestore.go`:
- Around line 272-275: The projection filters are inconsistent: update
EachAdjacentEdge to mirror EachEdge by skipping edges where
s.deletedEdges.Contains(next.ID) or either endpoint is deleted (use
s.deletedNodes.Contains(next.Start) and s.deletedNodes.Contains(next.End)
instead of checking Start twice), update NumEdges to count only edges that are
not in s.deletedEdges and whose Start and End nodes are not in s.deletedNodes
(e.g., iterate s.origin.EachEdge or filter the stored count accordingly), and
ensure EachAdjacentNode (and any other edge-returning APIs) applies the same
deletedEdges/deletedNodes checks so hidden edges cannot be observed via
adjacency helpers; reference s.origin, s.deletedEdges and s.deletedNodes in each
fix.
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Line 63: The n3 JOIN ON condition in CTE s3 is missing parentheses so the AND
binds tighter than the OR; update the n3 join (the one referencing n3.kind_ids
and n3.id with e1.end_id/e1.start_id) to parenthesize the end_id branch: wrap
"(n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n3.id = e1.end_id)"
and then OR n3.id = e1.start_id, mirroring how n2's ON clause is parenthesized
so the NodeKind1 predicate applies to both endpoint branches of e1.
- Around line 29-30: The aggregate in the s3 -> s0 projection is missing
DISTINCT and overcounts; update the select that defines i0 (currently
"count((s3.n0))::int8 as i0") to use a distinct count of the node composite
(e.g., count(DISTINCT (s3.n0))::int8 as i0) so it matches the Cypher semantics
count(distinct(n)) — locate the select in the CTE that projects "select s3.n2 as
n2, count((s3.n0))::int8 as i0 from s3 group by n2" and change only the
aggregate to use DISTINCT.
In `@cypher/models/pgsql/translate/constraints.go`:
- Around line 29-43: The terminalNodeConstraint function returns an OR-based
binary expression that is not wrapped in pgsql.NewParenthetical, causing
operator-precedence bugs when combined with OptionalAnd; update
terminalNodeConstraint to wrap the OR expression in pgsql.NewParenthetical (same
as the other constraint functions handling DirectionBoth) so the combined
predicate preserves intended grouping — locate terminalNodeConstraint and ensure
the compound pgsql.NewBinaryExpression( ... OperatorOr ... ) is passed into
pgsql.NewParenthetical before returning.
---
Outside diff comments:
In `@cypher/models/pgsql/translate/function.go`:
- Around line 309-334: The MinFunction and MaxFunction handlers are incorrectly
casting their single argument to pgsql.Float8 via TypeCastExpression, which
breaks non-numeric inputs; update the cypher.MinFunction and cypher.MaxFunction
cases to stop calling TypeCastExpression/using pgsql.Float8 and instead push the
original popped operand as the sole parameter in the pgsql.FunctionCall (i.e.,
after the NumArguments check and PopOperand call, use the popped argument
directly in treeTranslator.PushOperand with Function: pgsql.FunctionMin /
pgsql.FunctionMax and Parameters: []pgsql.Expression{argument}); keep existing
error handling and name checks intact.
---
Duplicate comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 32-33: Remove the duplicated test case block that starts with the
comment header "-- case: match
(n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) where
(n:NodeKind1 or n:NodeKind2) and n.enabled = true with m, count(distinct(n)) as
p where p >= 10 return m"; specifically delete the second byte-for-byte copy of
that case (the entire WITH ... SELECT ... query that builds s0/s1/s2/s3 and
returns s0.n2 as m) so only a single instance of the case remains, then run the
provided verification script or search for the identical "-- case:" header to
confirm there is exactly one occurrence.
In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql`:
- Line 54: The join for n1 is missing parentheses so the OR binds only to the id
checks instead of grouping both endpoint id branches under the NodeKind1
predicate; locate the "join node n1 on n1.kind_ids operator (pg_catalog.@>)
array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id" expression in
pattern_binding.sql and wrap the endpoint OR with parentheses so the condition
reads like "join node n1 on n1.kind_ids operator (pg_catalog.@>) array
[1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)" (i.e., ensure the
kind_ids check and both id branches are grouped together).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 50f5e6d0-2550-4c3b-a239-0f3fe8aab167
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (22)
- container/triplestore.go
- cypher/models/pgsql/test/testcase.go
- cypher/models/pgsql/test/translation_cases/multipart.sql
- cypher/models/pgsql/test/translation_cases/nodes.sql
- cypher/models/pgsql/test/translation_cases/pattern_binding.sql
- cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
- cypher/models/pgsql/test/translation_cases/quantifiers.sql
- cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
- cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
- cypher/models/pgsql/test/translation_cases/update.sql
- cypher/models/pgsql/test/validation_integration_test.go
- cypher/models/pgsql/translate/constraints.go
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/translate/expression.go
- cypher/models/pgsql/translate/function.go
- cypher/models/pgsql/translate/model.go
- cypher/models/pgsql/translate/predicate.go
- cypher/models/pgsql/translate/projection.go
- cypher/models/pgsql/translate/traversal.go
- cypher/models/walk/walk_pgsql.go
- drivers/pg/query/sql/schema_up.sql
- go.mod
✅ Files skipped from review due to trivial changes (6)
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/test/testcase.go
- drivers/pg/query/sql/schema_up.sql
- cypher/models/pgsql/translate/model.go
- cypher/models/pgsql/test/translation_cases/update.sql
- cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
🚧 Files skipped from review as they are similar to previous changes (4)
- cypher/models/pgsql/translate/predicate.go
- cypher/models/walk/walk_pgsql.go
- go.mod
- cypher/models/pgsql/test/translation_cases/quantifiers.sql
Actionable comments posted: 3
♻️ Duplicate comments (4)
container/triplestore.go (1)
282-290: ⚠️ Potential issue | 🟠 Major

`EachAdjacentEdge` still checks `next.Start` twice.

This method has the same bug that was just fixed in `EachEdge`: line 284 checks `next.Start` twice instead of checking both `next.Start` and `next.End`.

```diff
 func (s *triplestoreProjection) EachAdjacentEdge(node uint64, direction graph.Direction, delegate func(next Edge) bool) {
 	s.origin.EachAdjacentEdge(node, direction, func(next Edge) bool {
-		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.Start) {
+		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.End) {
 			return delegate(next)
 		}

 		return true
 	})
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@container/triplestore.go` around lines 282 - 290, In EachAdjacentEdge on triplestoreProjection you’re mistakenly checking next.Start twice; update the deletedNodes checks so the second check uses next.End (i.e., ensure the condition is !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.End)) so edges adjacent to deleted end nodes are filtered correctly; modify the anonymous func passed to s.origin.EachAdjacentEdge accordingly.

cypher/models/pgsql/translate/constraints.go (1)
29-43: ⚠️ Potential issue | 🔴 Critical

Wrap `terminalNodeConstraint`'s `DirectionBoth` predicate in `pgsql.NewParenthetical(...)`.

This is still the only traversal-side `A OR B` predicate returned without grouping. Once `rightNodeTraversalStepJoinCondition` is combined with other right-node filters via `pgsql.OptionalAnd(...)`, the `start_id` arm can bypass those filters; the updated negative-pattern expectations in `nodes.sql` already show that broken `... AND ... OR ...` shape.

🔧 Suggested fix

```diff
 case graph.DirectionBoth:
-	return pgsql.NewBinaryExpression(
-		pgsql.NewBinaryExpression(
-			pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
-			pgsql.OperatorEquals,
-			pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnEndID},
-		),
-		pgsql.OperatorOr,
-		pgsql.NewBinaryExpression(
-			pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
-			pgsql.OperatorEquals,
-			pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnStartID},
-		),
-	), nil
+	return pgsql.NewParenthetical(
+		pgsql.NewBinaryExpression(
+			pgsql.NewBinaryExpression(
+				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
+				pgsql.OperatorEquals,
+				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnEndID},
+			),
+			pgsql.OperatorOr,
+			pgsql.NewBinaryExpression(
+				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
+				pgsql.OperatorEquals,
+				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnStartID},
+			),
+		),
+	), nil
```

Also applies to: 76-90, 152-165
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/constraints.go` around lines 29 - 43, terminalNodeConstraint returns a DirectionBoth predicate composed as "A OR B" without grouping, which can escape surrounding AND filters when combined (see rightNodeTraversalStepJoinCondition and pgsql.OptionalAnd); wrap the DirectionBoth predicate in pgsql.NewParenthetical(...) wherever it is built (e.g., in terminalNodeConstraint and the similar constructions at the other locations mentioned) so the OR expression is parenthesized and cannot bypass adjacent filters.

cypher/models/pgsql/test/translation_cases/multipart.sql (2)
29-30: ⚠️ Potential issue | 🟠 Major

Preserve the `DISTINCT` in this aggregate.

Line 30 still uses plain `count((s3.n0))`, so the same `n` is overcounted when it reaches the same `m` through multiple paths. That does not match the `count(distinct(n))` case on line 29.

Suggested fix

```diff
-select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2
+select s3.n2 as n2, count(distinct (s3.n0))::int8 as i0 from s3 group by n2
```

In PostgreSQL, does `count(expr)` count duplicate non-null values, while `count(DISTINCT expr)` removes duplicates?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 29 - 30, The COUNT aggregate lost its DISTINCT qualifier causing duplicate n values to be overcounted; update the aggregate in the final CTE from count((s3.n0)) (in the SELECT inside CTE s0 / s3 -> producing i0) to count(DISTINCT (s3.n0)) so it matches the original intent count(distinct(n)) — locate the COUNT expression producing i0 (count((s3.n0))::int8 as i0 in the s3 -> s0 selection) and add DISTINCT to the count.
59-60: ⚠️ Potential issue | 🟠 Major

Parenthesize the `n3` endpoint predicate.

Line 60 still parses as `(n3.kind_ids @> ... and n3.id = e1.end_id) or n3.id = e1.start_id`, so the `start_id` branch can bind any node kind. That weakens the `g:NodeKind1` constraint for the undirected match.

Suggested fix

```diff
-join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n3.id = e1.end_id or n3.id = e1.start_id
+join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n3.id = e1.end_id or n3.id = e1.start_id)
```

In PostgreSQL SQL operator precedence, does `AND` bind tighter than `OR` inside a `JOIN ... ON` predicate?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 59 - 60, The ON predicate for the join with node n3 is missing parentheses so the expression is parsed as (n3.kind_ids @> ... AND n3.id = e1.end_id) OR n3.id = e1.start_id, weakening the g:NodeKind1 constraint; fix by parenthesizing the id check so the kind_ids requirement applies to both endpoints—e.g. wrap the OR as (n3.id = e1.end_id OR n3.id = e1.start_id) or wrap the whole right-hand side so the join clause for n3 reads n3.kind_ids operator (...) AND (n3.id = e1.end_id OR n3.id = e1.start_id) in the CTE that builds s3 (references: n3, e1, s1, s3).
🧹 Nitpick comments (1)
cypher/models/pgsql/translate/function.go (1)
309-335: Observation: Min/Max lack outer CastType, unlike Sum/Avg.

The Min and Max functions apply the Float8 input cast but don't set `CastType` on the resulting `FunctionCall`, whereas Sum and Avg set `CastType: pgsql.Numeric`. This appears intentional since `min()`/`max()` preserve input type semantics (returning Float8), while `sum()`/`avg()` benefit from Numeric precision.

If this asymmetry is intentional for semantic correctness, consider adding a brief comment explaining why Min/Max don't need the outer Numeric cast.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/function.go` around lines 309 - 335, Min and Max handlers (cypher.MinFunction, cypher.MaxFunction) cast the argument to Float8 via TypeCastExpression but do not set the FunctionCall.CastType (unlike Sum/Avg which set CastType: pgsql.Numeric); add a short clarifying comment above the Min/Max cases explaining that min()/max() intentionally preserve the input's Float8 type and therefore do not require the outer pgsql.Numeric CastType, and mention why Sum/Avg differ (numeric aggregation precision) to prevent future confusion when editing the FunctionCall construction in these branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 47-48: The recursive CTE s2 currently seeds at depth 1 and never
produces the zero-length path for the *0.. traversal, so nodes matching the
'%-516' predicate never populate the exclude collection; to fix, add a zero-hop
seed row to the s2 CTE (the same CTE declared as s2(root_id, next_id, depth,
satisfied, is_cycle, path)) that sets depth = 0, next_id = root_id (or both
start/end = the matching node), path = '{}' (empty array), satisfied = true when
the node matches ((n1.properties ->> 'objectid') like '%-516') and is_cycle =
true if appropriate; keep the existing recursive step unchanged so you still
collect longer paths and ensure s1/s0 aggregate will include the zero-length
node for the exclude check used later.
In `@cypher/models/pgsql/test/translation_cases/update.sql`:
- Around line 50-51: The bug is using UPDATE ... FROM ... RETURNING as row
carriers (CTEs s1/s2) which collapses one-to-many matches; instead, keep the
original match set in s0 and do updates in two phases: 1) collect matched rows
in s0 (already present), 2) perform UPDATE node AS n2 SET ... WHERE id IN
(select distinct (s0.n0).id from s0) RETURNING n2.id, n2.properties into a new
CTE (e.g., updated_nodes), then join s0 to updated_nodes on id to rehydrate one
row per original match, and finally perform the edge UPDATE using ids from that
rehydrated set (not the UPDATE returning rows) storing its RETURNING into
updated_edges; apply the same pattern for the edge update (use distinct ids from
the rehydrated rows, UPDATE edge AS e1 WHERE id IN (...) RETURNING ... into
updated_edges) so no match multiplicity is lost (refer to CTE names s0,
updated_nodes, updated_edges and tables node/edge and columns id, start_id,
end_id, properties).
In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 12-28: The function buildDirectionlessTraversalPatternRoot still
always joins the left node which causes an unused joined node and
row-multiplication when traversalStep.LeftNode is already bound; mirror the
LeftNodeBound handling from buildTraversalPatternRoot: detect when
traversalStep.LeftNodeBound (or when RewriteFrameBindings can satisfy
traversalStep.LeftNodeJoinCondition from the previous frame and
traversalStep.Edge), then join the edge/table from the prior frame and avoid
materializing the left-node table here (only materialize the terminal/right node
and adjust nextSelect/Projection accordingly); update the join creation logic
that uses traversalStep.LeftNode, traversalStep.Edge,
traversalStep.LeftNodeJoinCondition and nextSelect so the left node is not added
to the FROM/JOIN list when LeftNodeBound is true and ensure constraints use the
edge+previous frame identifiers instead.
---
Duplicate comments:
In `@container/triplestore.go`:
- Around line 282-290: In EachAdjacentEdge on triplestoreProjection you’re
mistakenly checking next.Start twice; update the deletedNodes checks so the
second check uses next.End (i.e., ensure the condition is
!s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) &&
!s.deletedNodes.Contains(next.End)) so edges adjacent to deleted end nodes are
filtered correctly; modify the anonymous func passed to
s.origin.EachAdjacentEdge accordingly.
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 29-30: The COUNT aggregate lost its DISTINCT qualifier causing
duplicate n values to be overcounted; update the aggregate in the final CTE from
count((s3.n0)) (in the SELECT inside CTE s0 / s3 -> producing i0) to
count(DISTINCT (s3.n0)) so it matches the original intent count(distinct(n)) —
locate the COUNT expression producing i0 (count((s3.n0))::int8 as i0 in the s3
-> s0 selection) and add DISTINCT to the count.
- Around line 59-60: The ON predicate for the join with node n3 is missing
parentheses so the expression is parsed as (n3.kind_ids @> ... AND n3.id =
e1.end_id) OR n3.id = e1.start_id, weakening the g:NodeKind1 constraint; fix by
parenthesizing the id check so the kind_ids requirement applies to both
endpoints—e.g. wrap the OR as (n3.id = e1.end_id OR n3.id = e1.start_id) or wrap
the whole right-hand side so the join clause for n3 reads n3.kind_ids operator
(...) AND (n3.id = e1.end_id OR n3.id = e1.start_id) in the CTE that builds s3
(references: n3, e1, s1, s3).
In `@cypher/models/pgsql/translate/constraints.go`:
- Around line 29-43: terminalNodeConstraint returns a DirectionBoth predicate
composed as "A OR B" without grouping, which can escape surrounding AND filters
when combined (see rightNodeTraversalStepJoinCondition and pgsql.OptionalAnd);
wrap the DirectionBoth predicate in pgsql.NewParenthetical(...) wherever it is
built (e.g., in terminalNodeConstraint and the similar constructions at the
other locations mentioned) so the OR expression is parenthesized and cannot
bypass adjacent filters.
---
Nitpick comments:
In `@cypher/models/pgsql/translate/function.go`:
- Around line 309-335: Min and Max handlers (cypher.MinFunction,
cypher.MaxFunction) cast the argument to Float8 via TypeCastExpression but do
not set the FunctionCall.CastType (unlike Sum/Avg which set CastType:
pgsql.Numeric); add a short clarifying comment above the Min/Max cases
explaining that min()/max() intentionally preserve the input's Float8 type and
therefore do not require the outer pgsql.Numeric CastType, and mention why
Sum/Avg differ (numeric aggregation precision) to prevent future confusion when
editing the FunctionCall construction in these branches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 80a70886-2ff9-45fe-9fae-79d607458564
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (22)
- container/triplestore.go
- cypher/models/pgsql/test/testcase.go
- cypher/models/pgsql/test/translation_cases/multipart.sql
- cypher/models/pgsql/test/translation_cases/nodes.sql
- cypher/models/pgsql/test/translation_cases/pattern_binding.sql
- cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
- cypher/models/pgsql/test/translation_cases/quantifiers.sql
- cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
- cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
- cypher/models/pgsql/test/translation_cases/update.sql
- cypher/models/pgsql/test/validation_integration_test.go
- cypher/models/pgsql/translate/constraints.go
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/translate/expression.go
- cypher/models/pgsql/translate/function.go
- cypher/models/pgsql/translate/model.go
- cypher/models/pgsql/translate/predicate.go
- cypher/models/pgsql/translate/projection.go
- cypher/models/pgsql/translate/traversal.go
- cypher/models/walk/walk_pgsql.go
- drivers/pg/query/sql/schema_up.sql
- go.mod
✅ Files skipped from review due to trivial changes (5)
- cypher/models/pgsql/test/testcase.go
- drivers/pg/query/sql/schema_up.sql
- cypher/models/pgsql/translate/expansion.go
- go.mod
- cypher/models/pgsql/translate/model.go
🚧 Files skipped from review as they are similar to previous changes (4)
- cypher/models/walk/walk_pgsql.go
- cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
- cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
- cypher/models/pgsql/test/translation_cases/pattern_binding.sql
…oin predicate placement - BED-6499
- fix: correct `triplestoreProjection.EachEdge` to check `next.End` instead of `next.Start` twice, preventing deleted end-nodes from being included in edge iteration
- fix: change recursive CTE depth condition from `<= N` to `< N` across all multipart traversal queries to enforce consistent exclusive upper-bound semantics
- fix: move `not c in exclude` filter out of the JOIN condition and into a final WHERE clause for collect/exclude path patterns, correcting result filtering
- fix: update edge accumulation in chained traversal to avoid reordering the final leg of a path
- fix: add the missing parenthetical wrapper around node kind grouping for bidirectional matches
- refactor: push node kind and property predicates from WHERE clauses into JOIN ON conditions throughout generated SQL for improved query planning
- test: replace `require.Nil` with `require.NoError` in `AssertLive` for clearer assertion semantics
- test: add additional expected SQL output cases for multipart, collect/exclude, and samaccountname translation scenarios
- chore: update go.mod, go.sum
- feat: add new btree index to accelerate fused kind and edge start or end node ID lookups
Actionable comments posted: 3
♻️ Duplicate comments (7)
cypher/models/pgsql/translate/constraints.go (1)
152-165: ⚠️ Potential issue | 🔴 Critical: `terminalNodeConstraint` still missing parenthesization for `DirectionBoth`.

This function returns an OR-based expression that is not wrapped in `pgsql.NewParenthetical`, while the analogous `leftNodeConstraint` and `rightEdgeConstraint` were fixed. When combined with other predicates via `OptionalAnd`, SQL operator precedence will cause incorrect filtering.

Suggested fix

case graph.DirectionBoth:
-	return pgsql.NewBinaryExpression(
+	return pgsql.NewParenthetical(
+		pgsql.NewBinaryExpression(
			pgsql.NewBinaryExpression(
				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
				pgsql.OperatorEquals,
				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnEndID},
			),
			pgsql.OperatorOr,
			pgsql.NewBinaryExpression(
				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
				pgsql.OperatorEquals,
				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnStartID},
			),
-	), nil
+		),
+	), nil

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/constraints.go` around lines 152 - 165, terminalNodeConstraint is returning an OR expression for graph.DirectionBoth without wrapping it in pgsql.NewParenthetical, causing operator precedence issues when combined via OptionalAnd; update the DirectionBoth branch in terminalNodeConstraint to wrap the pgsql.NewBinaryExpression(...) OR expression with pgsql.NewParenthetical(...) (matching how leftNodeConstraint and rightEdgeConstraint were fixed) so the combined predicate is parenthesized before returning.cypher/models/pgsql/test/validation_integration_test.go (1)
68-71: ⚠️ Potential issue | 🟡 Minor: Typo in variable name: `cassesPassed` should be `casesPassed`.

🔤 Proposed fix

var (
	casesRun     = 0
-	cassesPassed = 0
+	casesPassed  = 0
)

Also update references on lines 89 and 96.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/validation_integration_test.go` around lines 68 - 71, Rename the misspelled variable cassesPassed to casesPassed in the test scope where it is declared and update every reference to that identifier (e.g., increments and assertions) so they use casesPassed instead; ensure the variable name is consistently used throughout the test (including places that currently reference cassesPassed).cypher/models/pgsql/translate/traversal.go (1)
12-28: ⚠️ Potential issue | 🟠 Major: Directionless roots still need a `LeftNodeBound` branch.

Line 30 already detects a previous frame, but this builder still always emits `edge JOIN node left JOIN node right`. When the left node is already bound, that can leave the `left` join unused and multiply rows by `node` cardinality for bound directionless traversals. Mirror the bound-join shape from `buildTraversalPatternRoot` here.

Also applies to: 30-62
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/traversal.go` around lines 12 - 28, The builder buildDirectionlessTraversalPatternRoot always emits an unbound left Join even when traversalStep.LeftNodeBound is true, causing an unused left join and row multiplication; update this function to mirror the bound-join shape used in buildTraversalPatternRoot: detect traversalStep.LeftNodeBound and, when true, emit the join order that treats the left node as already bound (skip or convert the left join into a bound-join shape) while still applying partitionConstraintByLocality results (leftJoinLocal/leftJoinExternal and rightJoinLocal/rightJoinExternal) and preserving nextSelect. Ensure you reuse the same join construction logic/shape from buildTraversalPatternRoot so bound directionless traversals do not create the extra left JOIN.cypher/models/pgsql/test/translation_cases/multipart.sql (3)
59-60: ⚠️ Potential issue | 🟠 Major: Group the `n3` endpoint disjunction.

`n3.kind_ids ... and n3.id = e1.end_id or n3.id = e1.start_id` only constrains the `end_id` branch. The undirected `g:NodeKind1` match becomes too permissive on the `start_id` side.

Suggested fix

-join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n3.id = e1.end_id or n3.id = e1.start_id
+join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n3.id = e1.end_id or n3.id = e1.start_id)
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 59 - 60, The join condition for n3 in the s3 CTE is missing parentheses around the OR, so the "n3.kind_ids ... and n3.id = e1.end_id or n3.id = e1.start_id" only applies the kind_ids check to the end_id branch; update the join in the s3 CTE (the "join node n3 on ..." clause) to group the disjunction so the kind_ids check applies to both id branches — e.g., change it to require n3.kind_ids ... AND (n3.id = e1.end_id OR n3.id = e1.start_id) so the undirected match for g:NodeKind1 is correctly constrained.
47-48: ⚠️ Potential issue | 🟠 Major: `*0..` still needs a zero-hop seed.

`s2` still starts at depth 1, so the zero-length `dc = g` match never reaches `exclude`. That changes the final `not c in exclude` filter whenever a `-516` node should exclude itself.
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 47 - 48, The recursive CTE s2 currently seeds at depth 1 so zero-hop matches are omitted; modify the s2 seed to include a zero-length seed (root_id = e0.end_id, next_id = e0.end_id, depth = 0, satisfied = true, is_cycle = false, path = '{}' or empty array) or add an explicit initial SELECT that represents the zero-hop case before the existing union select, so that s0 (and ultimately exclude) contains the dc nodes equal to g; update any downstream assumptions in s0/s3 that expect non-empty path accordingly.
29-30: ⚠️ Potential issue | 🟠 Major: Use `DISTINCT` in this aggregate.

The case is `count(distinct(n))`, but the fixture still expects plain `count((s3.n0))`. That overcounts duplicate `n` bindings that reach the same `m` through multiple paths and blesses the wrong translation.

Suggested fix

-select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2
+select s3.n2 as n2, count(distinct (s3.n0))::int8 as i0 from s3 group by n2
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 29 - 30, The aggregate in the s3→s0 pipeline currently uses count((s3.n0)) which overcounts duplicates; change the aggregation in the query fragment "select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2" to use a DISTINCT qualifier on s3.n0 (e.g. count(distinct (s3.n0))::int8 as i0) so the computed i0 matches count(distinct(n)) and downstream filters in s0 behave correctly.cypher/models/pgsql/test/translation_cases/pattern_binding.sql (1)
53-54: ⚠️ Potential issue | 🟠 Major: The `n1` join still loses precedence.

`n0` is parenthesized now, but `n1.kind_ids ... and n1.id = e0.end_id or n1.id = e0.start_id` still lets the `start_id` branch skip the `NodeKind1` filter.

Suggested fix

-join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id
+join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql` around lines 53 - 54, The join for n1 loses precedence because the OR isn't parenthesized, allowing the "n1.id = e0.start_id" branch to bypass the kind_ids filter; update the n1 join condition to mirror n0 by grouping the kind_ids check with the id disjunction, e.g. change "join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id" to use parentheses so the condition reads like "(n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id))", ensuring the NodeKind1 filter applies to both id branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cypher/models/pgsql/test/translation_cases/nodes.sql`:
- Around line 189-190: terminalNodeConstraint's graph.DirectionBoth branch
currently returns an unwrapped BinaryExpression (OR) which breaks operator
precedence when combined with ANDed property/kind filters; change that branch to
wrap the BinaryExpression in pgsql.NewParenthetical(), mirroring the pattern
used in leftNodeConstraint and rightEdgeConstraint so expressions like
"(n1.properties ->> 'name') = 'n3' and (n1.id = e0.end_id or n1.id =
e0.start_id)" are correctly grouped; update the graph.DirectionBoth case inside
terminalNodeConstraint to return pgsql.NewParenthetical(<the existing
BinaryExpression>) instead of the raw BinaryExpression.
In `@cypher/models/walk/walk_pgsql.go`:
- Around line 255-258: The RowColumnReference branch only walks
typedNode.Identifier but must also walk the column identifier; in the case
handling for pgsql.RowColumnReference (returning &Cursor[pgsql.SyntaxNode]{Node:
node, Branches: []pgsql.SyntaxNode{typedNode.Identifier}}) update the Branches
list to include typedNode.Column as well (i.e., include both
typedNode.Identifier and typedNode.Column) so identifier-collection and rewrite
passes see the column side of row.column references.
- Around line 92-96: The traversal for pgsql.AliasedExpression currently only
adds typedNode.Expression to Cursor.Branches and omits the Alias, causing alias
identifiers to be skipped; update both case handlers for
*pgsql.AliasedExpression (the ones returning &Cursor[pgsql.SyntaxNode]{...}
around the shown diff and the similar block at lines ~104-108) to include
typedNode.Alias in the Branches slice (e.g., Branches:
[]pgsql.SyntaxNode{typedNode.Expression, typedNode.Alias}) so the generic walker
will visit the Alias (pgsql.Identifier) as well.
---
Duplicate comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 59-60: The join condition for n3 in the s3 CTE is missing
parentheses around the OR, so the "n3.kind_ids ... and n3.id = e1.end_id or
n3.id = e1.start_id" only applies the kind_ids check to the end_id branch;
update the join in the s3 CTE (the "join node n3 on ..." clause) to group the
disjunction so the kind_ids check applies to both id branches — e.g., change it
to require n3.kind_ids ... AND (n3.id = e1.end_id OR n3.id = e1.start_id) so the
undirected match for g:NodeKind1 is correctly constrained.
- Around line 47-48: The recursive CTE s2 currently seeds at depth 1 so zero-hop
matches are omitted; modify the s2 seed to include a zero-length seed (root_id =
e0.end_id, next_id = e0.end_id, depth = 0, satisfied = true, is_cycle = false,
path = '{}' or empty array) or add an explicit initial SELECT that represents
the zero-hop case before the existing union select, so that s0 (and ultimately
exclude) contains the dc nodes equal to g; update any downstream assumptions in
s0/s3 that expect non-empty path accordingly.
- Around line 29-30: The aggregate in the s3→s0 pipeline currently uses
count((s3.n0)) which overcounts duplicates; change the aggregation in the query
fragment "select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2" to
use a DISTINCT qualifier on s3.n0 (e.g. count(distinct (s3.n0))::int8 as i0) so
the computed i0 matches count(distinct(n)) and downstream filters in s0 behave
correctly.
In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql`:
- Around line 53-54: The join for n1 loses precedence because the OR isn't
parenthesized, allowing the "n1.id = e0.start_id" branch to bypass the kind_ids
filter; update the n1 join condition to mirror n0 by grouping the kind_ids check
with the id disjunction, e.g. change "join node n1 on n1.kind_ids operator
(pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id"
to use parentheses so the condition reads like "(n1.kind_ids operator
(pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id =
e0.start_id))", ensuring the NodeKind1 filter applies to both id branches.
In `@cypher/models/pgsql/test/validation_integration_test.go`:
- Around line 68-71: Rename the misspelled variable cassesPassed to casesPassed
in the test scope where it is declared and update every reference to that
identifier (e.g., increments and assertions) so they use casesPassed instead;
ensure the variable name is consistently used throughout the test (including
places that currently reference cassesPassed).
In `@cypher/models/pgsql/translate/constraints.go`:
- Around line 152-165: terminalNodeConstraint is returning an OR expression for
graph.DirectionBoth without wrapping it in pgsql.NewParenthetical, causing
operator precedence issues when combined via OptionalAnd; update the
DirectionBoth branch in terminalNodeConstraint to wrap the
pgsql.NewBinaryExpression(...) OR expression with pgsql.NewParenthetical(...)
(matching how leftNodeConstraint and rightEdgeConstraint were fixed) so the
combined predicate is parenthesized before returning.
In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 12-28: The builder buildDirectionlessTraversalPatternRoot always
emits an unbound left Join even when traversalStep.LeftNodeBound is true,
causing an unused left join and row multiplication; update this function to
mirror the bound-join shape used in buildTraversalPatternRoot: detect
traversalStep.LeftNodeBound and, when true, emit the join order that treats the
left node as already bound (skip or convert the left join into a bound-join
shape) while still applying partitionConstraintByLocality results
(leftJoinLocal/leftJoinExternal and rightJoinLocal/rightJoinExternal) and
preserving nextSelect. Ensure you reuse the same join construction logic/shape
from buildTraversalPatternRoot so bound directionless traversals do not create
the extra left JOIN.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4781e1bd-b6b7-4dc8-ac9a-f19be8ead553
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (21)
- container/triplestore.go
- cypher/models/pgsql/test/testcase.go
- cypher/models/pgsql/test/translation_cases/multipart.sql
- cypher/models/pgsql/test/translation_cases/nodes.sql
- cypher/models/pgsql/test/translation_cases/pattern_binding.sql
- cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
- cypher/models/pgsql/test/translation_cases/quantifiers.sql
- cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
- cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
- cypher/models/pgsql/test/translation_cases/update.sql
- cypher/models/pgsql/test/validation_integration_test.go
- cypher/models/pgsql/translate/constraints.go
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/translate/expression.go
- cypher/models/pgsql/translate/function.go
- cypher/models/pgsql/translate/model.go
- cypher/models/pgsql/translate/predicate.go
- cypher/models/pgsql/translate/projection.go
- cypher/models/pgsql/translate/traversal.go
- cypher/models/walk/walk_pgsql.go
- go.mod
✅ Files skipped from review due to trivial changes (4)
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/test/testcase.go
- cypher/models/pgsql/translate/model.go
- go.mod
🚧 Files skipped from review as they are similar to previous changes (5)
- container/triplestore.go
- cypher/models/pgsql/translate/projection.go
- cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
- cypher/models/pgsql/test/translation_cases/update.sql
- cypher/models/pgsql/test/translation_cases/quantifiers.sql
♻️ Duplicate comments (1)
cypher/models/pgsql/translate/traversal.go (1)
30-60: ⚠️ Potential issue | 🟠 Major: Still missing bound-node handling in the directionless root.

Lines 38-60 still always materialize both node tables. When `traversalStep.LeftNodeBound` or `traversalStep.RightNodeBound` is true, the corresponding join can be satisfied from the previous frame plus `edge`, so the extra `JOIN node ...` stops constraining that alias and multiplies rows. Mirror the bound-node branching from `buildTraversalPatternRoot` here instead of always joining both `n0` and `n1`.
Verify each finding against the current code and only fix it if needed. In `@cypher/models/pgsql/translate/traversal.go` around lines 30 - 60, The current traversal root always adds JOINs for both node tables (pgsql.TableNode) in nextSelect.From causing row multiplication when a node is bound; update the logic in the block that builds nextSelect.From (the code that appends the pgsql.FromClause with TableEdge and its Joins) to mirror the branching in buildTraversalPatternRoot: check traversalStep.LeftNodeBound and traversalStep.RightNodeBound and, for any bound side, avoid materializing the corresponding JOIN node table and instead satisfy that join from the previous frame + edge (use previousValidFrame(traversalStep.Frame) and the edge binding), while for unbound sides retain the existing JOIN with JoinOperator.Constraint set to pgsql.OptionalAnd(leftJoinLocal, traversalStep.LeftNodeJoinCondition) or pgsql.OptionalAnd(rightJoinLocal, traversalStep.RightNodeJoinCondition) respectively; ensure you still set the TableReference.Binding to traversalStep.LeftNode.Identifier / traversalStep.RightNode.Identifier where appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 30-60: The current traversal root always adds JOINs for both node
tables (pgsql.TableNode) in nextSelect.From causing row multiplication when a
node is bound; update the logic in the block that builds nextSelect.From (the
code that appends the pgsql.FromClause with TableEdge and its Joins) to mirror
the branching in buildTraversalPatternRoot: check traversalStep.LeftNodeBound
and traversalStep.RightNodeBound and, for any bound side, avoid materializing
the corresponding JOIN node table and instead satisfy that join from the
previous frame + edge (use previousValidFrame(traversalStep.Frame) and the edge
binding), while for unbound sides retain the existing JOIN with
JoinOperator.Constraint set to pgsql.OptionalAnd(leftJoinLocal,
traversalStep.LeftNodeJoinCondition) or pgsql.OptionalAnd(rightJoinLocal,
traversalStep.RightNodeJoinCondition) respectively; ensure you still set the
TableReference.Binding to traversalStep.LeftNode.Identifier /
traversalStep.RightNode.Identifier where appropriate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 94c39b79-23cf-4d07-8359-01aedfb06286
📒 Files selected for processing (4)
- cypher/models/pgsql/test/translation_cases/multipart.sql
- cypher/models/pgsql/test/validation_integration_test.go
- cypher/models/pgsql/translate/expansion.go
- cypher/models/pgsql/translate/traversal.go
✅ Files skipped from review due to trivial changes (1)
- cypher/models/pgsql/translate/expansion.go
🚧 Files skipped from review as they are similar to previous changes (2)
- cypher/models/pgsql/test/validation_integration_test.go
- cypher/models/pgsql/test/translation_cases/multipart.sql
37fbefe to e7e8f51
urangel
left a comment
The scope locality utilities are great! Thanks for all of the comments and fixes.
Tickets
Summary by CodeRabbit
Bug Fixes
New Features
Tests
Chores
Refactor