Skip to content

Unified error reporting for Android ExecuTorch JNI layers#18128

Open
psiddh wants to merge 15 commits intomainfrom
android_unified_error_reporting
Open

Unified error reporting for Android ExecuTorch JNI layers#18128
psiddh wants to merge 15 commits intomainfrom
android_unified_error_reporting

Conversation

@psiddh
Copy link
Contributor

@psiddh psiddh commented Mar 12, 2026

All four Android JNI modules (Generic, LLM, ASR, Training) previously used inconsistent exception types — a mix of RuntimeException, IllegalStateException, IllegalArgumentException, and the structured ExecutorchRuntimeException. This unifies them so every JNI error path throws ExecutorchRuntimeException with a proper error code, giving callers a single type to catch and programmatic access to the underlying runtime error.

Key changes per module:

LLM (jni_layer_llama.cpp) — generate() now captures the Error return from runner_->generate(), wraps the call in try/catch, reports failures via a new onError(errorCode, message) callback, and returns the actual error code instead of always returning 0.

ASR (jni_layer_asr.cpp) — replaced six env->ThrowNew(...) calls with setExecutorchPendingException (for pure JNI path)

Training (jni_layer_training.cpp) — added jni_helper.h include; replaced five throwNewJavaException("java/lang/Exception", ...) calls with throwExecutorchException, preserving the actual error codes from the failed operations.

Additionally: added default onError callbacks to LlmCallback (Java) and AsrCallback (Kotlin);

This PR was authored with the assistance of Claude.

cc @kirklandsign @cbilgin

@psiddh psiddh requested a review from kirklandsign as a code owner March 12, 2026 16:52
Copilot AI review requested due to automatic review settings March 12, 2026 16:52
@pytorch-bot
Copy link

pytorch-bot bot commented Mar 12, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18128

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 55 Pending, 1 Unrelated Failure

As of commit 25b2fb8 with merge base b7ca1a4 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 12, 2026
@github-actions
Copy link

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to standardize Android ExecuTorch JNI error reporting by routing JNI failures through ExecutorchRuntimeException (with runtime error codes) and adds a couple of Android-side ergonomics improvements (LLM/ASR callbacks, log buffer behavior, and a RAM-vs-model-size helper).

Changes:

  • Unifies multiple JNI error paths to throw ExecutorchRuntimeException via jni_helper::throwExecutorchException(...).
  • Updates the Android in-memory log buffer to use a bounded std::deque (64 entries) for efficient eviction.
  • Adds error callbacks to LLM/ASR callback interfaces and introduces ExecuTorchRuntime.checkMemoryFit(...).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
extension/android/jni/log.h Switches log buffer accessor signature from vector to deque.
extension/android/jni/log.cpp Implements bounded deque log buffer (64 entries) with pop_front().
extension/android/jni/jni_layer.cpp Updates log-buffer read path to consume a deque.
extension/android/jni/jni_layer_training.cpp Routes training JNI failures through throwExecutorchException(...) with real error codes.
extension/android/jni/jni_layer_llama.cpp Adds onError callback and returns actual Error from generate().
extension/android/jni/jni_layer_asr.cpp Replaces env->ThrowNew(...) sites with throwExecutorchException(...).
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmCallback.java Adds default onError(int, String) callback.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/asr/AsrCallback.kt Adds onError(Int, String) with a default body.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/ExecuTorchRuntime.java Adds checkMemoryFit(Context, String) helper.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings March 12, 2026 17:24
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

extension/android/jni/jni_layer_training.cpp:156

  • If execute_forward_backward() fails, throwExecutorchException is called but the function then continues to use result.get() to build the return array. Because throwExecutorchException may return without throwing, this can result in accessing an invalid Result. Return immediately after reporting the error (e.g., return an empty array / nullptr) to make the control flow safe and explicit.
    auto result =
        module_->execute_forward_backward(methodName->toStdString(), evalues);
    if (!result.ok()) {
      executorch::jni_helper::throwExecutorchException(
          static_cast<uint32_t>(result.error()),
          "Execution of forward_backward for method " + methodName->toStdString() + " failed");
    }

    facebook::jni::local_ref<facebook::jni::JArrayClass<JEValue>> jresult =
        facebook::jni::JArrayClass<JEValue>::newArray(result.get().size());

extension/android/jni/jni_layer_training.cpp:179

  • In namedParameters(), if module_->named_parameters() fails you call throwExecutorchException but then continue and iterate over result.get(). Add an early return right after throwExecutorchException (and apply the same pattern to namedGradients()) so you never touch result.get() on the error path.
    auto method = methodName->toStdString();
    auto result = module_->named_parameters(method);
    if (!result.ok()) {
      executorch::jni_helper::throwExecutorchException(
          static_cast<uint32_t>(result.error()),
          "Getting named parameters for method " + method + " failed");
    }
    facebook::jni::local_ref<
        facebook::jni::JHashMap<jstring, TensorHybrid::javaobject>>
        parameters = facebook::jni::
            JHashMap<jstring, TensorHybrid::javaobject>::create();
    for (auto& [layer, tensor] : result.get()) {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings March 12, 2026 17:33
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@psiddh psiddh force-pushed the android_unified_error_reporting branch 2 times, most recently from d04b07a to 1fe13a9 Compare March 13, 2026 03:08
Copilot AI review requested due to automatic review settings March 13, 2026 03:11
@psiddh psiddh force-pushed the android_unified_error_reporting branch from 1fe13a9 to 7d3d3ab Compare March 13, 2026 03:11
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@psiddh psiddh force-pushed the android_unified_error_reporting branch from 7d3d3ab to 469bb31 Compare March 13, 2026 03:16
@psiddh psiddh requested review from GregoryComer and Copilot March 13, 2026 03:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings March 13, 2026 05:04
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@psiddh psiddh force-pushed the android_unified_error_reporting branch from 764a6cc to df17065 Compare March 13, 2026 06:48
Standardize all Android JNI modules (LLM, ASR, Training) to throw
ExecutorchRuntimeException with structured error codes instead of a mix of
RuntimeException, IllegalStateException, and raw Exception.

- LLM generate() now captures Error returns, calls new onError callback, and
  returns actual error codes. Added else branch for unknown model types.
- ASR: replaced raw env->ThrowNew with setExecutorchPendingException (safe for
  plain JNIEXPORT functions that can't use fbjni's C++ exception mechanism).
- Training: replaced throwNewJavaException("java/lang/Exception") with
  throwExecutorchException using actual error codes. Added early returns after
  error throws to prevent dereferencing errored Results.
- Added setExecutorchPendingException to jni_helper for raw-JNI-safe exception
  setting (uses env->Throw instead of fbjni's C++ exception).
- Added onError callbacks to LlmCallback (Java) and AsrCallback (Kotlin).
- Added ExecuTorchRuntime.checkMemoryFit(Context, String) with null validation.

This PR was authored with the assistance of Claude.
Copilot AI review requested due to automatic review settings March 13, 2026 07:03
Copy link
Member

@GregoryComer GregoryComer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for cleaning this up. I don't have a ton of context on JNI error handling, but it looks reasonable to me.

@psiddh psiddh force-pushed the android_unified_error_reporting branch from c897541 to b4ed924 Compare March 16, 2026 17:17
Remove checkMemoryFit() from ExecuTorchRuntime.java — the naive
fileSize vs availMem comparison doesn't account for KV cache, load-time
peaks, or mmap behavior, so it's misleading. Can revisit with proper
benchmarking data.

In jni_layer_training.cpp, replace throw std::runtime_error(...) after
throwExecutorchException() with return/return{}. The fbjni-thrown
ExecutorchRuntimeException already carries the structured error code;
a redundant std::runtime_error would lose it if it propagated instead.
Also fix clang-format lint for SGDHybrid::registerNatives().

Co-authored-by: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 16, 2026 17:19
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 16, 2026 19:01
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings March 17, 2026 00:52
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 17, 2026 01:19
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 17, 2026 01:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 17, 2026 15:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: android Issues related to Android code, build, and execution

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

4 participants