Skip to content

Wire analytics-engine as extendedPlugins dependency#5302

Open
ahkcs wants to merge 7 commits intofeature/mustang-ppl-integrationfrom
pr/mustang-plugin-wiring
Open

Wire analytics-engine as extendedPlugins dependency#5302
ahkcs wants to merge 7 commits intofeature/mustang-ppl-integrationfrom
pr/mustang-plugin-wiring

Conversation

@ahkcs
Copy link
Copy Markdown
Collaborator

@ahkcs ahkcs commented Apr 1, 2026

Summary

  • Wire analytics-engine as extendedPlugins parent classloader for shared Calcite classes
  • Add bundlePlugin excludes to prevent jar hell between parent/child classloaders
  • Add analytics-engine plugin to all test clusters (integTest, yamlRestTest, security, JDBC, doctest)
  • Add commons-text to analytics-engine ZIP for Calcite fuzzy matching support
  • Fix isAnalyticsIndex to use lightweight parsing context (no cluster state needed)

Test plan

  • All existing integration tests pass with analytics-engine loaded
  • RestUnifiedQueryActionTest passes (routing logic)
  • Doctests pass (no ClassNotFoundException for LevenshteinDistance)
  • No jar hell errors on plugin startup

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 0ad18a8.

PathLineSeverityDescription
libs/analytics-framework-3.6.0-SNAPSHOT.jar1highUnverifiable binary JAR committed directly to the repository. Contents cannot be audited via code review. Supply chain risk — maintainers must verify the artifact's provenance and integrity before merging.
libs/analytics-engine-3.6.0-SNAPSHOT.jar1highUnverifiable binary JAR committed directly to the repository. Contents cannot be audited via code review. Supply chain risk — maintainers must verify the artifact's provenance and integrity before merging.
libs/analytics-engine-3.6.0-SNAPSHOT.zip1highUnverifiable binary ZIP (OpenSearch plugin bundle) committed directly to the repository. Plugin bundles execute privileged code inside OpenSearch nodes. Supply chain risk — maintainers must verify provenance and integrity.
core/build.gradle66highAdds the unverifiable local JAR `libs/analytics-framework-3.6.0-SNAPSHOT.jar` as an `api` (transitive) compile dependency. Any malicious code in that artifact will be bundled and distributed with the plugin.
plugin/build.gradle164highAdds the unverifiable local JAR `libs/analytics-engine-3.6.0-SNAPSHOT.jar` as a `compileOnly` dependency. Classes from this artifact are referenced at compile time and resolved at runtime from the installed plugin bundle.
integ-test/build.gradle273highInstalls the unverifiable local `analytics-engine-3.6.0-SNAPSHOT.zip` as an OpenSearch plugin into all test clusters (integTest, yamlRestTest, remoteCluster, integTestWithSecurity, remoteIntegTestWithSecurity, integJdbcTest). Plugin code runs with full node privileges during CI.

The table above displays the top 10 most important findings.

Total: 6 | Critical: 0 | High: 6 | Medium: 0 | Low: 0


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

ahkcs added 4 commits April 1, 2026 15:11
Step 1: Plugin wiring and dependency integration.

- Add analytics-engine as extendedPlugins in plugin/build.gradle
- Vendor analytics-framework JAR (interfaces) and analytics-engine
  ZIP (plugin) built from OpenSearch sandbox/3.6 branch
- Delete local QueryPlanExecutor interface, use upstream
  org.opensearch.analytics.exec.QueryPlanExecutor from JAR
- Replace StubSchemaProvider with OpenSearchSchemaBuilder which reads
  real index mappings from ClusterState
- Delete StubSchemaProvider (no longer needed)
- Exclude shared JARs (Calcite, Guava, commons-*, etc.) from SQL
  plugin bundle to avoid jar hell with analytics-engine classloader
- Load analytics-engine plugin in integTest and remoteCluster test
  clusters before opensearch-sql-plugin
- Create parquet_logs and parquet_metrics indices in ITs so
  OpenSearchSchemaBuilder can resolve the schema
- Update explain expected files for alphabetical field ordering

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Every test cluster that loads opensearch-sql-plugin needs the
analytics-engine plugin because SQL declares it as extendedPlugins.
Added to yamlRestTest, integTestWithSecurity, remoteIntegTestWithSecurity,
and integJdbcTest clusters.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
commons-text is needed by Calcite (parent classloader) for fuzzy matching
but was only in the SQL plugin (child classloader). Also use lightweight
parsing context in isAnalyticsIndex to avoid requiring cluster state.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the pr/mustang-plugin-wiring branch from 300fc86 to 62578cd Compare April 1, 2026 22:12
@ahkcs ahkcs added PPL Piped processing language enhancement New feature or request labels Apr 1, 2026
Calcite's EnumerableInterpretable.getBindable() hardcodes
EnumerableInterpretable.class.getClassLoader() for Janino compilation.
When analytics-engine is the parent classloader via extendedPlugins,
this returns the parent classloader which cannot see SQL plugin classes,
causing CompileException for any Enumerable code generation.

Override implement() in OpenSearchCalcitePreparingStmt to use our own
compileWithPluginClassLoader() which does the same code generation but
uses CalciteToolsHelper.class.getClassLoader() (SQL plugin's child
classloader) so Janino can resolve both parent and child classes.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the pr/mustang-plugin-wiring branch from ca8bcf1 to 2c4c401 Compare April 1, 2026 23:24
ahkcs added 2 commits April 2, 2026 00:25
The previous approach (analytics-engine as parent) caused Janino
classloader issues in 4 Calcite code paths. Reversing the relationship
makes SQL plugin the parent, so Janino uses the SQL plugin classloader
which can see both Calcite and SQL plugin classes.

Changes:
- Remove extendedPlugins=['analytics-engine'] from SQL plugin
- Add ExtensiblePlugin interface to SQLPlugin
- Rebuild analytics-engine ZIP with extended.plugins=opensearch-sql
  and without Calcite JARs (inherited from parent)
- Move OpenSearchSchemaBuilder to analytics-framework JAR
- Change analytics-framework from compileOnly to api in core
- Reverse test cluster plugin load order (SQL first)
- Revert all Janino classloader fixes (no longer needed)
- Add classloader architecture options doc

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant