Refactor device selection: Rename to computePolicy, remove accelerated, and add fallback#923
Refactor device selection: Rename to computePolicy, remove accelerated, and add fallback#923mingmingtasd wants to merge 1 commit intowebmachinelearning:mainfrom
Conversation
| <dt>"<dfn enum-value>low-power</dfn>"</dt> | ||
| <dd>Prioritizes power consumption over other considerations such as execution speed.</dd> | ||
| <dt>"<dfn enum-value>fallback</dfn>"</dt> | ||
| <dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd> |
There was a problem hiding this comment.
This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons. Can the text be updated to mention this case?
There was a problem hiding this comment.
(just naming) Hmm, reading the description of this enum ("Prioritizes maximum compatibility...", "useful for testing a model's numeric behavior..."), some more immediately enlightening names to readers would be something like "compatible"/"compatibility"/"stable"/"precision". I would have not guessed that description from the word "fallback", which implies you fell back to a less capable device than the one you really wanted, when actually CPU may have been exactly what you wanted (that is, it was the primary preference, not a fallback). e.g.
enum MLComputePolicy {
"default",
"high-performance",
"low-power",
"compatible", // or stable/local...
};This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons
Totally, as I recall some teams in the past complaining about GPU overhead for background audio filtering in chat apps, preferring to keep compute more local on the CPU. Perhaps as a separate PR, we could add an explicit "low-latency" option too, which would be even clearer in intent for that scenario.
Also "precision" would be a useful preference, as some NPU's and GPU's chop off low bits (also, separate PR).
There was a problem hiding this comment.
+1 for both the "compatible" (in this PR), adding "low-latency" in a next PR, and adding a preference/hint for "precision" (to MLContextOptions?).
fdwr
left a comment
There was a problem hiding this comment.
Consider clearer naming, otherwise LGTM sir.
| <dt>"<dfn enum-value>low-power</dfn>"</dt> | ||
| <dd>Prioritizes power consumption over other considerations such as execution speed.</dd> | ||
| <dt>"<dfn enum-value>fallback</dfn>"</dt> | ||
| <dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd> |
There was a problem hiding this comment.
(just naming) Hmm, reading the description of this enum ("Prioritizes maximum compatibility...", "useful for testing a model's numeric behavior..."), some more immediately enlightening names to readers would be something like "compatible"/"compatibility"/"stable"/"precision". I would have not guessed that description from the word "fallback", which implies you fell back to a less capable device than the one you really wanted, when actually CPU may have been exactly what you wanted (that is, it was the primary preference, not a fallback). e.g.
enum MLComputePolicy {
"default",
"high-performance",
"low-power",
"compatible", // or stable/local...
};This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons
Totally, as I recall some teams in the past complaining about GPU overhead for background audio filtering in chat apps, preferring to keep compute more local on the CPU. Perhaps as a separate PR, we could add an explicit "low-latency" option too, which would be even clearer in intent for that scenario.
Also "precision" would be a useful preference, as some NPU's and GPU's chop off low bits (also, separate PR).
To fix #911
Description:
This PR refactors the device selection preference API to establish a more extensible framework by replacing MLPowerPreference with MLComputePolicy in MLContextOptions.
Key changes included:
The corresponding chromium CL is https://chromium-review.googlesource.com/c/chromium/src/+/7513189
PTAL, thanks! @huningxin
Preview | Diff