Skip to content

[BugFix] Fix appendcol subquery cannot resolve MAP sub-paths after spath (#5186)#5304

Draft
qianheng-aws wants to merge 1 commit intoopensearch-project:mainfrom
qianheng-aws:fix/appendcol-spath-5186
Draft

[BugFix] Fix appendcol subquery cannot resolve MAP sub-paths after spath (#5186)#5304
qianheng-aws wants to merge 1 commit intoopensearch-project:mainfrom
qianheng-aws:fix/appendcol-spath-5186

Conversation

@qianheng-aws
Copy link
Copy Markdown
Collaborator

Description

When spath auto-extract transforms a STRING field to MAP, the appendcol subquery could not resolve dotted MAP paths like doc.user.city. The subquery failed with INTERNAL_ITEM function expects {[ARRAY,INTEGER]|[STRUCT,ANY]}, but got [STRING,STRING].

Root cause: visitAppendCol attached only the raw Relation to the subsearch via getRelation() + transformPlanToAttachChild(), missing schema-transforming commands like spath. Additionally, PPLQueryDataAnonymizer.visitAppendCol runs first and attaches a raw Relation, and Relation.attach() is a no-op, so subsequent attachment attempts in the visitor silently failed.

Fix: buildAppendColSubsearchSource() collects SPath nodes from the main plan chain and builds a new SPath -> Relation chain. The subsearch attachment walks to find the parent of any existing Relation and re-attaches with the full source chain including spath transformations.

Related Issues

Resolves #5186

Check List

  • New functionality includes testing
  • Commits signed per DCO (-s)
  • spotlessCheck passed
  • Unit tests passed
  • Integration tests passed
  • Grammar changes: all three .g4 files synced (N/A - no grammar changes)

…ath (opensearch-project#5186)

When spath auto-extract transforms a STRING field to MAP, the appendcol
subquery could not resolve dotted paths like doc.user.city because it
operated on the raw source table where doc is still STRING.

Root cause: visitAppendCol attached only the raw Relation to the
subsearch via getRelation()+transformPlanToAttachChild(), missing
schema-transforming commands like spath. Additionally, the
PPLQueryDataAnonymizer.visitAppendCol runs first and attaches a raw
Relation, and Relation.attach() is a no-op, so subsequent attachment
attempts silently failed.

Fix: buildAppendColSubsearchSource() collects SPath nodes from the main
plan chain and builds a new SPath->Relation chain. The subsearch
attachment walks to find the parent of any existing Relation and
re-attaches with the full source chain including spath transformations.

Signed-off-by: Heng Qian <qianheng@amazon.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
@qianheng-aws
Copy link
Copy Markdown
Collaborator Author

Decision Log

Root Cause: Two interacting issues caused the bug:

  1. visitAppendCol used getRelation(node) to extract only the raw Relation (table scan) from the main plan, discarding schema-transforming commands like spath that convert STRING fields to MAP.
  2. PPLQueryDataAnonymizer.visitAppendCol runs before CalciteRelNodeVisitor.visitAppendCol and attaches a raw Relation to the subsearch via transformPlanToAttachChild(). Since Relation.attach() is a no-op (returns this without setting a child), any subsequent attempt to attach a different source to the subsearch's leaf silently failed — the Relation was already there but couldn't be replaced through the leaf.

Approach:

  • buildAppendColSubsearchSource() walks the main plan's child chain, collects SPath nodes, and builds a fresh SPath(copy) -> Relation chain.
  • Instead of using transformPlanToAttachChild() (which finds the leaf and attaches, but fails when the leaf is an immutable Relation), the fix walks the subsearch to find the parent whose child is a Relation, then calls parent.attach(subsearchSource) to replace the raw Relation with the SPath-enhanced source.

Alternatives Rejected:

  1. Attaching the full child plan (all commands between Relation and AppendCol): This broke existing tests because schema-reducing commands like stats or eval would restrict the available fields for the subsearch.
  2. Using transformPlanToAttachChild with SPath copies: The anonymizer pre-attaches a raw Relation to the subsearch. transformPlanToAttachChild finds the Relation as the leaf and calls Relation.attach(spathCopy) — which is a no-op. The SPath was silently discarded.

Pitfalls:

  • Relation.attach() is a no-op by design (leaf nodes don't accept children). Any code that relies on transformPlanToAttachChild after the anonymizer has already attached a Relation will silently fail.
  • The anonymizer's visitAppendCol mutates the AST (attaches Relation to subsearch) as a side effect. This creates an ordering dependency between the anonymizer and the visitor.

Things to Watch:

  • If new schema-transforming commands are added (similar to spath), they would need to be added to buildAppendColSubsearchSource().
  • The anonymizer's visitAppendCol still uses getRelation + transformPlanToAttachChild — this is fine for anonymization purposes but should be noted if the anonymizer's behavior changes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Fix appendcol subsearch spath propagation with unit tests

Relevant files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAppendcolTest.java

Sub-PR theme: Add integration and YAML REST tests for spath + appendcol fix

Relevant files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSpathCommandIT.java
  • integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5186.yml

⚡ Recommended focus areas for review

SPath Order Bug

In buildAppendColSubsearchSource, SPath nodes are collected top-down (from AppendCol toward Relation) and then applied in the same order, wrapping each new SPath around the previous source. This means the first SPath encountered (closest to AppendCol) ends up as the outermost node, reversing the original chain order. If multiple spath commands are chained, the resulting subsearch source will have them in reverse order compared to the main plan, potentially causing incorrect schema transformations.

UnresolvedPlan source = relation;
for (SPath spath : spathNodes) {
  SPath copy = new SPath(spath.getInField(), spath.getOutField(), spath.getPath());
  copy.attach(source);
  source = copy;
}
return source;
Shared Mutable AST

Fresh SPath copies are created to avoid sharing mutable AST nodes, but getRelation(node) returns the same Relation object used by the main plan. If Relation is mutable and attach() or other operations modify it, sharing it between the main plan and the subsearch source chain could cause subtle bugs.

UnresolvedPlan relation = getRelation(node);
// Walk the child chain from AppendCol down to the Relation, collecting SPath nodes
List<SPath> spathNodes = new ArrayList<>();
UnresolvedPlan current = (UnresolvedPlan) node.getChild().getFirst();
while (current != null) {
  if (current instanceof SPath spath) {
    spathNodes.add(spath);
  }
  if (current.getChild() != null && !current.getChild().isEmpty()) {
    current = (UnresolvedPlan) current.getChild().getFirst();
  } else {
    break;
  }
}
if (spathNodes.isEmpty()) {
  return relation;
}
Fragile Walk Logic

The loop in visitAppendCol that walks down to find the parent of a Relation node uses getFirst() and checks instanceof Relation. If the subsearch has a more complex structure (e.g., multiple children, or no Relation child at all), the loop may terminate at the wrong node or call attach on an unexpected node, silently producing incorrect results.

UnresolvedPlan parent = node.getSubSearch();
while (parent.getChild() != null && !parent.getChild().isEmpty()) {
  UnresolvedPlan child = (UnresolvedPlan) parent.getChild().getFirst();
  if (child instanceof Relation) {
    // Parent's child is the Relation -- replace it with our source chain
    break;
  }
  parent = child;
}
parent.attach(subsearchSource);

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Reverse SPath chain reconstruction order

The spathNodes list is built by walking the child chain top-down, so the first
element is the outermost SPath and the last is the innermost (closest to the
Relation). When rebuilding the chain, iterating in the same order wraps each
subsequent SPath around the previous one, resulting in a reversed chain where the
outermost SPath ends up innermost. The list should be iterated in reverse order so
the chain is reconstructed correctly (innermost SPath attached to Relation first,
outermost last).

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [2570-2574]

-for (SPath spath : spathNodes) {
+for (int i = spathNodes.size() - 1; i >= 0; i--) {
+  SPath spath = spathNodes.get(i);
   SPath copy = new SPath(spath.getInField(), spath.getOutField(), spath.getPath());
   copy.attach(source);
   source = copy;
 }
Suggestion importance[1-10]: 7

__

Why: This is a valid logical concern - the spathNodes list is built top-down, so iterating in the same order when rebuilding would reverse the chain. However, the test in CalcitePPLAppendcolTest passes with the current implementation, suggesting either the order doesn't matter for single SPath nodes (the common case) or the existing tests don't cover multiple SPath nodes. For correctness with multiple SPath nodes, reversing the iteration order is the right approach.

Medium
Handle edge cases in subsearch traversal loop

The loop walks down the child chain and breaks when it finds a node whose child is a
Relation, but it never handles the case where the subsearch itself has no children
(empty or null). In that scenario, parent remains as node.getSubSearch() and attach
is called on it, which may be incorrect. Additionally, if the deepest node is not a
Relation (e.g., the chain ends without one), the loop exits without breaking and
parent ends up at the leaf, which may not be the intended attachment point.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [2442-2451]

 UnresolvedPlan parent = node.getSubSearch();
 while (parent.getChild() != null && !parent.getChild().isEmpty()) {
   UnresolvedPlan child = (UnresolvedPlan) parent.getChild().getFirst();
   if (child instanceof Relation) {
     // Parent's child is the Relation -- replace it with our source chain
     break;
   }
+  if (child.getChild() == null || child.getChild().isEmpty()) {
+    // child is the deepest non-Relation node; attach here
+    parent = child;
+    break;
+  }
   parent = child;
 }
 parent.attach(subsearchSource);
Suggestion importance[1-10]: 4

__

Why: The suggestion identifies a real edge case where the loop exits without finding a Relation child, leaving parent at the leaf node. However, the existing loop already handles this correctly by natural termination - when parent.getChild() is null or empty, the while condition fails and parent is already at the deepest node. The improved code adds an explicit inner check that is largely redundant with the outer while condition, making this a minor improvement at best.

Low

@qianheng-aws qianheng-aws added bug Something isn't working bugFix labels Apr 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working bugFix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] appendcol subquery cannot resolve MAP sub-paths

1 participant