Summary
Add support for running ExecuTorch vision models (e.g. object detection) on local WebRTC camera frames in real time, using react-native-webrtc's existing ProcessorProvider plugin system. This would allow us to integrate with tools such as Fishjam.
Motivation
Users building WebRTC video call apps often want to run on-device ML on the local camera feed — object detection, pose estimation, segmentation — before or alongside sending the stream to a peer. There is currently no way to do this with react-native-executorch.
Approach
react-native-webrtc ships a plugin system for intercepting captured frames on both platforms:
iOS — implement VideoFrameProcessorDelegate:
```objc
- (RTCVideoFrame *)capturer:(RTCVideoCapturer *)capturer
    didCaptureVideoFrame:(RTCVideoFrame *)frame;
```

Android — implement VideoFrameProcessor:

```java
VideoFrame process(VideoFrame frame, SurfaceTextureHelper textureHelper);
```

On iOS, RTCVideoFrame wraps a CVPixelBufferRef — the exact same type our existing FrameExtractor.cpp already handles (the same zero-copy path as VisionCamera). On Android, call toI420() on the frame and convert the resulting planes to RGB for the existing generateFromPixels path.
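The Android half, unpacking the toI420() planes into packed RGB, is plain pixel math. Below is a minimal C++ sketch of that conversion; the function name `i420ToRGB` and the BT.601 full-range coefficients are my assumptions, not anything in the repo, and in practice libyuv (bundled with WebRTC) would likely be the faster choice. Note that U and V are subsampled 2x2 in I420, and that each plane carries its own stride because rows may be padded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert one I420 frame (separate Y, U, V planes,
// each with its own row stride) into tightly packed RGB24, using the
// BT.601 full-range coefficients. In I420 the chroma planes are
// subsampled 2x2, so each U/V sample covers a 2x2 block of luma pixels.
std::vector<uint8_t> i420ToRGB(const uint8_t* yPlane, int yStride,
                               const uint8_t* uPlane, int uStride,
                               const uint8_t* vPlane, int vStride,
                               int width, int height) {
  std::vector<uint8_t> rgb(static_cast<size_t>(width) * height * 3);
  auto clamp8 = [](float v) {
    // Round to nearest and clamp into [0, 255].
    return static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, v + 0.5f)));
  };
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      float Y = yPlane[y * yStride + x];
      float U = uPlane[(y / 2) * uStride + (x / 2)] - 128.0f;
      float V = vPlane[(y / 2) * vStride + (x / 2)] - 128.0f;
      size_t i = (static_cast<size_t>(y) * width + x) * 3;
      rgb[i + 0] = clamp8(Y + 1.402f * V);                      // R
      rgb[i + 1] = clamp8(Y - 0.344136f * U - 0.714136f * V);   // G
      rgb[i + 2] = clamp8(Y + 1.772f * U);                      // B
    }
  }
  return rgb;
}
```

The stride parameters map directly onto `I420Buffer`'s `getStrideY()`/`getStrideU()`/`getStrideV()` accessors, so the converted buffer can be handed to generateFromPixels without an intermediate repack.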
VideoFrame process(VideoFrame frame, SurfaceTextureHelper textureHelper);On iOS, RTCVideoFrame wraps a CVPixelBufferRef — the exact same type our existing FrameExtractor.cpp already handles (same zero-copy path as VisionCamera). On Android, call toI420() and convert to RGB for the existing generateFromPixels path.