NUTCH-3154 Implement integration testing framework for Nutch IndexWriter plugins using Testcontainers #895

Merged
lewismc merged 3 commits into apache:master from lewismc:NUTCH-3154
Feb 21, 2026

@lewismc lewismc commented Feb 13, 2026

This is a proof of concept for NUTCH-3154.
I really like this model because the @Testcontainers(disabledWithoutDocker = true) annotation simply skips the tests when no local container environment is present. More info at https://java.testcontainers.org/quickstart/junit_5_quickstart/#4-additional-attributes

The new target is test-indexer-integration. This will be triggered in CI if changes to src/plugin/indexer-*/** are detected.

My proposal is as follows

| indexer plugin | action |
| --- | --- |
| indexer-cloudsearch | DO NOT implement; see "Transition from Amazon CloudSearch to Amazon OpenSearch Service". Likely we should @Deprecate this plugin. |
| indexer-csv | DO NOT implement, for testing only |
| indexer-dummy | DO NOT implement, for testing only |
| indexer-elastic | implement per this PR |
| indexer-kafka | implement with Kafka module |
| indexer-opensearch-1x | DO NOT implement. Likely we should @Deprecate this plugin. |
| indexer-rabbit | implement with RabbitMQ module |
| indexer-solr | implement with Solr module |

Thanks for any peer review

@lewismc lewismc self-assigned this Feb 13, 2026

lewismc commented Feb 13, 2026

Looks like the Ubuntu CI run was successful, with the new target executing in 1m 19s.


lewismc commented Feb 13, 2026

I am happy to add the Solr, RabbitMQ and Kafka implementations to this PR. If we decide to @Deprecate the cloudsearch and opensearch-1x plugins, I would propose opening a separate issue and PR for that.


lewismc commented Feb 17, 2026

PR updated. I think this is a cleaner implementation. The proposed indexer integration tests reuse the protocol-test pattern (abstract base in core, plugins extend it and plug in their specifics), but they run against real backends, use a separate Ant target, and expose more configuration via the interface.
I also intentionally chose to NOT include integration tests in the existing ant test execution to retain existing behavior. The CI will take care of integration testing if indexer plugin changes are detected.
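
For readers unfamiliar with the pattern described above (abstract base in core, plugins extend and plug in specifics), here is a minimal sketch in plain Java. Every name in it (IndexerIntegrationSketch, AbstractIndexerIT, SketchWriter, InMemoryIndexerIT) is hypothetical and is not taken from the code in this PR. An in-memory list stands in for the real backend so the sketch runs without Docker; an actual plugin test would presumably start and stop a Testcontainers container in the two backend hooks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// All names here are hypothetical; this is not the code merged in the PR.
public class IndexerIntegrationSketch {

  /** Minimal stand-in for an index writer under test. */
  interface SketchWriter {
    void write(Map<String, String> doc);
    void commit();
  }

  /** Shared flow lives in the base class; subclasses plug in the backend. */
  abstract static class AbstractIndexerIT {
    protected abstract void startBackend();
    protected abstract void stopBackend();
    protected abstract SketchWriter createWriter();
    protected abstract long countIndexedDocs();

    /** Writes the docs, commits, and returns how many the backend reports. */
    final long runRoundTrip(List<Map<String, String>> docs) {
      startBackend();
      try {
        SketchWriter writer = createWriter();
        for (Map<String, String> doc : docs) {
          writer.write(doc);
        }
        writer.commit();
        return countIndexedDocs();
      } finally {
        stopBackend(); // always release the backend, even if a step fails
      }
    }
  }

  /** In-memory backend (an assumption) so the sketch runs without Docker. */
  static class InMemoryIndexerIT extends AbstractIndexerIT {
    private final List<Map<String, String>> pending = new ArrayList<>();
    private final List<Map<String, String>> committed = new ArrayList<>();

    @Override protected void startBackend() { pending.clear(); committed.clear(); }
    @Override protected void stopBackend() { pending.clear(); committed.clear(); }

    @Override protected SketchWriter createWriter() {
      return new SketchWriter() {
        @Override public void write(Map<String, String> doc) { pending.add(doc); }
        @Override public void commit() { committed.addAll(pending); pending.clear(); }
      };
    }

    @Override protected long countIndexedDocs() { return committed.size(); }
  }

  public static void main(String[] args) {
    List<Map<String, String>> docs = List.of(
        Map.of("id", "1", "title", "first"),
        Map.of("id", "2", "title", "second"));
    long indexed = new InMemoryIndexerIT().runRoundTrip(docs);
    System.out.println("indexed=" + indexed); // prints "indexed=2"
  }
}
```

The point of the split is that the write/commit/verify flow is written once, while each plugin only answers four questions: how to start its backend, how to stop it, how to build its writer, and how to count what arrived.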

@sebastian-nagel sebastian-nagel left a comment

+1

That's an important contribution. Manually testing the indexer plugins which send docs to an indexer service was a burden.

> I also intentionally chose to NOT include integration tests in the existing ant test execution to retain existing behavior.

Definitely. Also because the integration tests can take quite long - 9 minutes on my laptop.

> The CI will take care of integration testing if indexer plugin changes are detected.

Perfect!

@lewismc lewismc merged commit 64ac8b4 into apache:master Feb 21, 2026
9 checks passed
@lewismc lewismc deleted the NUTCH-3154 branch February 21, 2026 16:42