Speech to speech: Mute functionality #5688
…BotFramework-WebChat into feature/s2s-mute
done

        bytes[i] = binaryString.charCodeAt(i);
      }
      return bytes.some(byte => byte !== 0);
    }
Aren't they simple negations of each other? 🤣
It's okay to work on it when we touch it.
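The point raised in this thread, that one check is the negation of the other, could be sketched as follows. This is a minimal illustration, assuming the chunk has already been base64-decoded into a binary string (as `atob` would return); the names `toBytes`, `hasAudio`, and `isSilent` are hypothetical and not taken from the PR.

```typescript
// Hypothetical helper names; the PR's actual helpers are only partially visible above.
function toBytes(binaryString: string): Uint8Array {
  const bytes = new Uint8Array(binaryString.length);

  for (let i = 0; i < binaryString.length; i++) {
    bytes[i] = binaryString.charCodeAt(i);
  }

  return bytes;
}

// A chunk carries audio if any decoded byte is non-zero.
function hasAudio(binaryString: string): boolean {
  return toBytes(binaryString).some(byte => byte !== 0);
}

// Silence is defined purely as the negation, so the two checks cannot drift apart.
function isSilent(binaryString: string): boolean {
  return !hasAudio(binaryString);
}
```

Defining one helper in terms of the other keeps a single source of truth for what counts as a silent chunk.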
compulim left a comment
Nice work, love the code, well done.
Pull request overview
Adds core (non-UI) mute/unmute support for Speech-to-Speech recording by introducing a muted voice state and keeping the server stream alive via silent audio chunks while the physical microphone is stopped.
Changes:
- Add `VOICE_MUTE_RECORDING`/`VOICE_UNMUTE_RECORDING` actions and reducer handling, plus `muted` in `VoiceState`.
- Extend S2S recorder/worklet to support `MUTE`/`UNMUTE` and generate silent frames while muted; wire it via `VoiceRecorderBridge`.
- Expose a consumer-facing hook `useVoiceRecordingMuted`, export it through component/bundle entrypoints, and add unit + HTML harness tests.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/core/src/reducers/voiceActivity.ts | Adds mute/unmute action handling and transitions voice state to/from muted. |
| packages/core/src/index.ts | Exports new mute/unmute actions (and reorders related voice exports). |
| packages/core/src/actions/muteVoiceRecording.ts | New Redux action creator/constants for muting voice recording. |
| packages/core/src/actions/unmuteVoiceRecording.ts | New Redux action creator/constants for unmuting voice recording. |
| packages/core/src/actions/setVoiceState.ts | Extends VoiceState union to include muted. |
| packages/api/src/providers/SpeechToSpeech/private/useRecorder.ts | Adds AudioWorklet mute logic + recorder mute/unmute flow with silent chunks. |
| packages/api/src/providers/SpeechToSpeech/private/VoiceRecorderBridge.tsx | Bridges voiceState === 'muted' to recorder mute/unmute behavior. |
| packages/api/src/providers/SpeechToSpeech/private/useRecorder.spec.tsx | Adds unit tests for mute/unmute behavior and commands. |
| packages/api/src/hooks/useVoiceRecordingMuted.ts | New public hook for consumers to read/set muted state via Redux actions. |
| packages/api/src/hooks/index.ts | Exports useVoiceRecordingMuted. |
| packages/api/src/boot/hook.ts | Re-exports useVoiceRecordingMuted from the boot hook entry. |
| packages/component/src/boot/hook.ts | Surfaces useVoiceRecordingMuted through component hook entry. |
| packages/bundle/src/boot/actual/hook/minimal.ts | Surfaces useVoiceRecordingMuted through minimal bundle hook entry. |
| tests/assets/esm/speechToSpeech/mockMediaDevices.js | Updates test audio mock to support muted vs non-muted chunk generation. |
| tests/html2/speechToSpeech/mute.unmute.html | Adds HTML harness test validating idle/listening/muted transitions and chunk content. |
| CHANGELOG.md | Adds changelog entry for S2S mute/unmute functionality. |
    case VOICE_UNMUTE_RECORDING:
      if (state.voiceState !== 'muted') {
        console.warn(`botframework-webchat: Should not transit from "${state.voiceState}" to "listening"`);
      }

      return {
        ...state,
        voiceState: 'listening'
      };
`VOICE_UNMUTE_RECORDING` currently forces `voiceState: 'listening'` even when the current state is not `muted` (it only logs a warning). This means calling `useVoiceRecordingMuted()[1](false)` while `voiceState` is `idle` (or any other state) will incorrectly transition into `listening`. Consider returning `state` (no-op) unless `state.voiceState === 'muted'`, similar to the guard used for `VOICE_MUTE_RECORDING`.
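The no-op guard suggested above could look roughly like the sketch below. It is not the PR's reducer: the `VoiceState` union is reduced to three members, the action type strings are placeholders, and the rule that muting is only valid from `listening` is an assumption made for illustration.

```typescript
// Simplified stand-ins; the real VoiceState union and action type values may differ.
type VoiceState = 'idle' | 'listening' | 'muted';

type VoiceActivityState = Readonly<{ voiceState: VoiceState }>;

const VOICE_MUTE_RECORDING = 'VOICE_MUTE_RECORDING';
const VOICE_UNMUTE_RECORDING = 'VOICE_UNMUTE_RECORDING';

type VoiceAction = { type: typeof VOICE_MUTE_RECORDING } | { type: typeof VOICE_UNMUTE_RECORDING };

function voiceActivity(state: VoiceActivityState, action: VoiceAction): VoiceActivityState {
  switch (action.type) {
    case VOICE_MUTE_RECORDING:
      // Assumption: muting is only meaningful while listening.
      return state.voiceState === 'listening' ? { ...state, voiceState: 'muted' } : state;

    case VOICE_UNMUTE_RECORDING:
      // No-op unless currently muted, so "idle" can never jump to "listening".
      return state.voiceState === 'muted' ? { ...state, voiceState: 'listening' } : state;

    default:
      return state;
  }
}
```

Returning the same `state` reference on an invalid transition also avoids unnecessary re-renders in subscribers that compare by identity.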
        console.warn(`botframework-webchat: Should not transit from "${state.voiceState}" to "listening"`);
      }
Console warning message uses "transit" (e.g., "Should not transit...") which reads like a typo/grammar issue in a user-visible log message. Consider changing it to "transition" for clarity and consistency.
    // Restart MediaStream and reconnect source (fire and forget)
    acquireAndConnectMediaStream();
The unmute closure calls `acquireAndConnectMediaStream()` without awaiting or handling errors. If `getUserMedia()` rejects, this becomes an unhandled promise rejection; and if recording stops while the promise is in-flight, the resolved stream can be assigned to `streamRef` without being connected/stopped, potentially turning the mic indicator back on while not recording. Consider making the unmute path async (or explicitly catch) and adding a cancellation/guard so newly acquired streams are immediately stopped if the worklet/recording has been torn down.
    - // Restart MediaStream and reconnect source (fire and forget)
    - acquireAndConnectMediaStream();
    + // Restart MediaStream and reconnect source (handle errors and teardown race)
    + (async () => {
    +   try {
    +     await acquireAndConnectMediaStream();
    +   } catch {
    +     // Swallow errors to avoid unhandled promise rejections.
    +     // Optionally log if needed.
    +   } finally {
    +     // If recording/worklet have been torn down while acquiring the stream,
    +     // immediately stop any newly acquired media stream to avoid leaving
    +     // the mic active when not recording.
    +     if (!audioCtxRef.current || !workletRef.current) {
    +       stopMediaStream();
    +     }
    +   }
    + })();
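One alternative way to handle the teardown race described above is a generation counter that invalidates any acquisition still in flight. The sketch below is illustrative only: `createStreamGuard`, `StreamLike`, and the injected `acquire` callback are hypothetical names, with `acquire` standing in for a `getUserMedia`-based function and `StreamLike.stop()` for stopping a stream's tracks.

```typescript
// Hypothetical shapes; none of these names come from the PR.
type StreamLike = { stop(): void };

function createStreamGuard(acquire: () => Promise<StreamLike>, onError: (error: unknown) => void) {
  let current: StreamLike | undefined;
  let generation = 0;

  return {
    // Starts acquisition; resolves once the stream is either adopted or discarded.
    async start(): Promise<void> {
      const myGeneration = ++generation;

      try {
        const stream = await acquire();

        if (myGeneration !== generation) {
          // Torn down (or restarted) while waiting: release the microphone immediately.
          stream.stop();

          return;
        }

        current = stream;
      } catch (error) {
        // Report instead of leaving an unhandled promise rejection.
        onError(error);
      }
    },

    // Stops the active stream and invalidates any acquisition still in flight.
    stop(): void {
      generation++;
      current?.stop();
      current = undefined;
    },

    isActive: (): boolean => current !== undefined
  };
}
```

Compared with checking refs in a `finally` block, the counter also covers the case where recording is stopped and restarted before the first acquisition resolves.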
    const audioData = new Float32Array(bufferSize);
    if (!muted) {
      for (let i = 0; i < bufferSize; i++) {
        audioData[+i] = Math.sin(i * 0.1) * 0.5;
In the mock sine-wave generator, `audioData[+i]` is equivalent to `audioData[i]` but the unary `+` is unnecessary and makes the intent harder to read. Consider using `audioData[i]` here.
    -     audioData[+i] = Math.sin(i * 0.1) * 0.5;
    +     audioData[i] = Math.sin(i * 0.1) * 0.5;
    - 👷🏻 Added `npm run build-browser` script for building test harness package only, in PR [#5667](https://github.com/microsoft/BotFramework-WebChat/pull/5667), by [@compulim](https://github.com/compulim)
    - Added pull-based capabilities system for dynamically discovering adapter capabilities at runtime, in PR [#5679](https://github.com/microsoft/BotFramework-WebChat/pull/5679), by [@pranavjoshi001](https://github.com/pranavjoshi001)
    - Added Speech-to-Speech (S2S) support for real-time voice conversations, in PR [#5654](https://github.com/microsoft/BotFramework-WebChat/pull/5654), by [@pranavjoshi](https://github.com/pranavjoshi001)
    - Added core mute/unmute functionality for speech-to-speech via `useRecorder` hook (silent chunks keep server connection alive), in PR [#5688](https://github.com/microsoft/BotFramework-WebChat/pull/5688), by [@pranavjoshi](https://github.com/pranavjoshi001)
Changelog entry says mute/unmute was added "via `useRecorder` hook", but `useRecorder` is an internal/private implementation detail under `providers/SpeechToSpeech/private`. For consumers, the new public surface appears to be `useVoiceRecordingMuted` (and/or the `muteVoiceRecording`/`unmuteVoiceRecording` actions). Consider rewording this entry to reference the public API to avoid confusing integrators.
    - - Added core mute/unmute functionality for speech-to-speech via `useRecorder` hook (silent chunks keep server connection alive), in PR [#5688](https://github.com/microsoft/BotFramework-WebChat/pull/5688), by [@pranavjoshi](https://github.com/pranavjoshi001)
    + - Added core mute/unmute functionality for speech-to-speech via `useVoiceRecordingMuted` hook and `muteVoiceRecording` / `unmuteVoiceRecording` actions (silent chunks keep server connection alive), in PR [#5688](https://github.com/microsoft/BotFramework-WebChat/pull/5688), by [@pranavjoshi](https://github.com/pranavjoshi001)
Changelog Entry
Description
This PR adds mute/unmute functionality for the Speech-to-Speech (S2S) feature as core API only, without UI changes. When muted, the microphone is turned off (browser indicator disappears) but silent audio chunks continue to be sent to keep the server connection alive. This prevents connection timeouts while allowing consumers to implement their own mute UI.
Design
The mute functionality works at multiple levels:

AudioWorklet Level:
- `MUTE` and `UNMUTE` commands

useRecorder Hook (`useRecorder.ts`):
- `mute()` function that:
  - … `MUTE` command to the worklet
- `unmute()` function that:
  - … `UNMUTE` command to the worklet
  - … `getUserMedia`

VoiceRecorderBridge (`VoiceRecorderBridge.tsx`):
- … `mute` function to the voice state machine
- … `muted`, calls `mute()` and stores the `unmute` function
- … `listening`, calls the stored `unmute` function

Redux Actions & Hooks:
- `muteVoiceRecording` and `unmuteVoiceRecording` Redux actions
- `useVoiceRecordingMuted` hook for consumers that gives value and setter function.

Specific Changes
- `MUTE` and `UNMUTE` command handling in AudioWorklet processor to generate silent chunks when muted
- `mute` function to `useRecorder.ts` hook that disconnects audio and stops MediaStream while continuing to send silent chunks
- `VoiceRecorderBridge.tsx` to handle mute/unmute based on voice state changes
- `muteVoiceRecording.ts` and `unmuteVoiceRecording.ts` Redux actions
- `voiceActivity.ts` reducer to handle `VOICE_MUTE_RECORDING` and `VOICE_UNMUTE_RECORDING` actions
- `useRecorder.spec.tsx`
- `CHANGELOG.md`

Review Checklist
- … `z-index`)
- `package.json` and `package-lock.json` reviewed
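The AudioWorklet-level behavior described under Design (accept `MUTE`/`UNMUTE` commands and emit silent frames while muted) could be modeled as in the sketch below. It is a plain class rather than a real `AudioWorkletProcessor`, so it can run outside a browser; the class name `MutableFrameSource` and its method signatures are hypothetical and not the PR's processor code.

```typescript
// Hypothetical model of the worklet's mute handling; not the PR's implementation.
type Command = 'MUTE' | 'UNMUTE';

class MutableFrameSource {
  private muted = false;

  // In a real worklet this would be driven by messages arriving on the processor's port.
  handleCommand(command: Command): void {
    this.muted = command === 'MUTE';
  }

  // Returns the frame to emit: zero-filled while muted, the microphone input otherwise.
  process(input: Float32Array | undefined, frameSize: number): Float32Array | undefined {
    if (this.muted) {
      // Typed arrays are zero-initialized, so this is a silent frame.
      return new Float32Array(frameSize);
    }

    return input;
  }
}
```

Because the muted branch never reads `input`, frames keep flowing even after the microphone stream has been stopped, which is what keeps the server connection alive.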