feat: Instance Segmentation API #893

Open
benITo47 wants to merge 53 commits into main from @bo/instanceSegmentation

Conversation

@benITo47
Contributor

@benITo47 benITo47 commented Mar 2, 2026

Description

This PR introduces the Instance Segmentation module. The provided API supports two predefined models (RF-DETR and the YOLO26-Seg family) and allows plugging in custom models through fromCustomConfig.

In addition to the new API, this PR adds common CV utilities on the C++ side and migrates Object Detection to use them.

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

  • Run test suite
  • Try new features by running demo app for Instance Segmentation
  • Confirm ObjectDetection works as expected by running respective demo apps

Screenshots

Related issues

#825

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch 2 times, most recently from 84a560c to bd23487 Compare March 10, 2026 10:40
@benITo47 benITo47 marked this pull request as ready for review March 11, 2026 10:56
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from 04e5cc2 to 1446d54 Compare March 11, 2026 11:05
@msluszniak

This comment was marked as resolved.

@msluszniak msluszniak assigned benITo47 and unassigned benITo47 Mar 11, 2026
@msluszniak msluszniak added feature PRs that implement a new feature labels Mar 11, 2026
@msluszniak msluszniak linked an issue Mar 11, 2026 that may be closed by this pull request
@benITo47 benITo47 changed the title @bo/instance segmentation feat: Instance Segmentation API Mar 11, 2026
@msluszniak
Member

There are some broken links in the documentation. Can you fix them?

@benITo47 benITo47 marked this pull request as draft March 12, 2026 12:57
chmjkb
chmjkb previously requested changes Mar 16, 2026
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from 8557eba to a755a61 Compare March 16, 2026 12:17
@benITo47 benITo47 changed the base branch from main to @nk/vision-models-camera-integration March 16, 2026 12:18
@NorbertKlockiewicz NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch 2 times, most recently from b28847b to 836c7a2 Compare March 16, 2026 15:53
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from a755a61 to b2fa3be Compare March 16, 2026 16:12
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from c19137a to b722ff9 Compare March 17, 2026 10:08
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from b722ff9 to 0e90871 Compare March 17, 2026 10:24
@benITo47 benITo47 requested a review from mkopcins March 17, 2026 13:43
@benITo47 benITo47 dismissed chmjkb’s stale review March 17, 2026 13:44

JC has time off and is out of office

@benITo47 benITo47 requested a review from msluszniak March 17, 2026 13:44
@msluszniak
Member

I will look at this PR in a minute

@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from 55e4265 to 3dd47a5 Compare March 17, 2026 14:21
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from 68d259b to c6b21de Compare March 17, 2026 14:32
@benITo47 benITo47 force-pushed the @bo/instanceSegmentation branch from c6b21de to e4d7a61 Compare March 17, 2026 14:34
return VisionModel::modelInputSize();
}
const auto &shape = inputShapes[0];
return cv::Size(shape[shape.size() - 2], shape[shape.size() - 1]);
Member

Suggested change
- return cv::Size(shape[shape.size() - 2], shape[shape.size() - 1]);
+ return {shape[shape.size() - 2], shape[shape.size() - 1]};

Comment on lines +124 to +135
std::tuple<utils::computer_vision::BBox, float, int32_t>
BaseInstanceSegmentation::extractDetectionData(const float *bboxData,
                                               const float *scoresData,
                                               int32_t index) {
  utils::computer_vision::BBox bbox{
      bboxData[index * 4], bboxData[index * 4 + 1], bboxData[index * 4 + 2],
      bboxData[index * 4 + 3]};
  float score = scoresData[index * 2];
  int32_t label = static_cast<int32_t>(scoresData[index * 2 + 1]);

  return {bbox, score, label};
}
Member

Why do we return a tuple here? These should be three separate functions, each fetching a different type of data.
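One way the split suggested above could look is sketched below. This is only an illustration, not the code in this PR: `BBox` is a hypothetical stand-in for `utils::computer_vision::BBox`, the names `extractBBox`, `extractScore`, and `extractLabel` are invented for the example, and the buffer layout (4 floats per box, interleaved score/label pairs) is assumed from the snippet under review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for utils::computer_vision::BBox.
struct BBox {
  float x1, y1, x2, y2;
};

// Assumed layout: bboxData stores 4 consecutive floats per detection.
inline BBox extractBBox(const float *bboxData, int32_t index) {
  assert(bboxData != nullptr && index >= 0);
  const float *box = bboxData + index * 4;
  return {box[0], box[1], box[2], box[3]};
}

// Assumed layout: scoresData stores (score, label) pairs per detection.
inline float extractScore(const float *scoresData, int32_t index) {
  assert(scoresData != nullptr && index >= 0);
  return scoresData[index * 2];
}

// The label is stored as a float in the same buffer and cast to an integer.
inline int32_t extractLabel(const float *scoresData, int32_t index) {
  assert(scoresData != nullptr && index >= 0);
  return static_cast<int32_t>(scoresData[index * 2 + 1]);
}
```

With this shape, a caller that only needs the score to apply a confidence threshold can skip building the box and label until a detection passes the check.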

int32_t my2 =
    std::min(maskSize.height, static_cast<int32_t>(std::ceil(my2F)));

return cv::Rect(mx1, my1, mx2 - mx1, my2 - my1);
Member

Suggested change
- return cv::Rect(mx1, my1, mx2 - mx1, my2 - my1);
+ return {mx1, my1, mx2 - mx1, my2 - my1};

int32_t x2 = std::min(maskSize.width, rect.x + rect.width + 1);
int32_t y2 = std::min(maskSize.height, rect.y + rect.height + 1);

return cv::Rect(x1, y1, x2 - x1, y2 - y1);
Member

Suggested change
- return cv::Rect(x1, y1, x2 - x1, y2 - y1);
+ return {x1, y1, x2 - x1, y2 - y1};
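For context, a minimal sketch of the pad-and-clamp pattern with a braced return is shown below. It is an illustration under stated assumptions rather than the PR's implementation: `Size` and `Rect` are hypothetical stand-ins for `cv::Size` and `cv::Rect` so that the sketch compiles without OpenCV, the function name `padRectWithinMask` is invented, and the lower-bound handling of `x1` and `y1` is assumed because only the `x2` and `y2` lines are visible in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for cv::Size and cv::Rect.
struct Size {
  int32_t width, height;
};
struct Rect {
  int32_t x, y, width, height;
};

// Grows rect by one pixel on every side and clamps it to the mask bounds.
inline Rect padRectWithinMask(const Rect &rect, const Size &maskSize) {
  assert(maskSize.width >= 0 && maskSize.height >= 0);
  // Assumed: the top-left corner is clamped at zero.
  const int32_t x1 = std::max<int32_t>(0, rect.x - 1);
  const int32_t y1 = std::max<int32_t>(0, rect.y - 1);
  // As in the diff: the bottom-right corner is clamped at the mask size.
  const int32_t x2 = std::min<int32_t>(maskSize.width, rect.x + rect.width + 1);
  const int32_t y2 =
      std::min<int32_t>(maskSize.height, rect.y + rect.height + 1);

  // The return type is already declared, so a braced list is enough.
  return {x1, y1, x2 - x1, y2 - y1};
}
```

The braced form works because the declared return type fixes which type is list-initialized, so repeating the type name in the return statement adds nothing.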

Comment on lines +50 to +54
TEST(InstanceSegGenerateTests, InvalidImagePathThrows) {
  BaseInstanceSegmentation model(kValidInstanceSegModelPath, {}, {}, true,
                                 nullptr);
  EXPECT_THROW((void)model.generateFromString("nonexistent_image.jpg", 0.5, 0.5,
                                              100, {}, true, kMethodName),
Member

@msluszniak msluszniak Mar 17, 2026

These tests are not triggered by the test script.

Also, I get some errors which didn't occur earlier:

/Users/msluszniak/test_bare_rn/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/tests/integration/TextToSpeechTest.cpp:77: Failure
Expected: Kokoro(kValidLang, kValidTaggerPath, kValidPhonemizerPath, kValidDurationPath, kValidSynthesizerPath, "nonexistent_voice.bin", nullptr) throws an exception of type RnExecutorchError.
Actual: it throws std::invalid_argument with description "File not found: kokoro_en_tagger.json".

[ FAILED ] TTSCtorTests.InvalidVoicePathThrows (0 ms)

This one was fixed with LLM & TTS integration

/Users/msluszniak/test_bare_rn/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/tests/integration/BaseModelTests.h:121: Failure
Type parameterized test suite CommonModelTest is defined via REGISTER_TYPED_TEST_SUITE_P, but never instantiated via INSTANTIATE_TYPED_TEST_SUITE_P. None of the test cases will run.

Ideally, TYPED_TEST_P definitions should only ever be included as part of binaries that intend to use them. (As opposed to, for example, being placed in a library that may be linked in to get other utilities.)

To suppress this error for this test suite, insert the following line (in a non-header) in the namespace it is defined in:

GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(CommonModelTest);

[ FAILED ] GoogleTestVerification.UninstantiatedTypeParameterizedTestSuite (0 ms)

/Users/msluszniak/test_bare_rn/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/tests/integration/BaseModelTests.h:108: Failure
Expected: Traits::callGenerate(model) doesn't throw an exception.
Actual: it throws rnexecutorch::RnExecutorchError with description "The model's forward function did not succeed. Ensure the model input is correct.".

/Users/msluszniak/test_bare_rn/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/tests/integration/BaseModelTests.h:109: Failure
Expected: Traits::callGenerate(model) doesn't throw an exception.
Actual: it throws rnexecutorch::RnExecutorchError with description "The model's forward function did not succeed. Ensure the model input is correct.".

/Users/msluszniak/test_bare_rn/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/tests/integration/BaseModelTests.h:110: Failure
Expected: Traits::callGenerate(model) doesn't throw an exception.
Actual: it throws rnexecutorch::RnExecutorchError with description "The model's forward function did not succeed. Ensure the model input is correct.".

[ FAILED ] TextToImage/CommonModelTest/0.MultipleGeneratesWork, where TypeParam = rnexecutorch::models::text_to_image::TextToImage (4981 ms)

@@ -0,0 +1,277 @@
import Spinner from '../../components/Spinner';
Member

There are some things to fix in the demo app:

  1. Why is there a live object detection entry when there is a separate vision camera screen? Also, when I click on it, I get an unmatched route.
  2. Why aren't instance and semantic segmentation next to each other in the list below?
  3. There is no way to go back from the vision camera example. I missed this in the previous review, cc: @NorbertKlockiewicz

Contributor

  1. You can just swipe from the left side

Member

@msluszniak msluszniak Mar 17, 2026

@NorbertKlockiewicz I don't think that is good enough. If I had a problem and didn't think of that swipe, then potentially many more users might have the same problem (nothing on screen hints that the gesture is available). That's my opinion.

@msluszniak
Member

Also, we should fix how the tags are aligned to the bounding box. Below is an example of how it shouldn't look. If you don't want to fix this in this PR, please create a separate issue.
image

Labels

feature PRs that implement a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Instance Segmentation API

5 participants