
feat: extend deeplinks for recording control + Raycast extension #1683

Open
AliaksandrNazaruk wants to merge 2 commits into CapSoftware:main from AliaksandrNazaruk:feat/deeplinks-raycast

Conversation


@AliaksandrNazaruk AliaksandrNazaruk commented Mar 26, 2026

Summary

Extends Cap's deeplink support with full recording control actions and adds a Raycast extension.

New Deeplink Actions

| Action | Description |
| --- | --- |
| pause_recording | Pause current recording |
| resume_recording | Resume paused recording |
| toggle_pause_recording | Toggle pause/resume |
| restart_recording | Restart current recording |
| take_screenshot | Take a screenshot (with capture mode) |

These complement the existing start_recording, stop_recording, open_editor, and open_settings actions.

Deeplink Protocol

Unit variants (no parameters):

cap-desktop://action?value="stop_recording"
cap-desktop://action?value="pause_recording"
cap-desktop://action?value="resume_recording"
cap-desktop://action?value="toggle_pause_recording"
cap-desktop://action?value="restart_recording"

Struct variants (with parameters):

cap-desktop://action?value={"take_screenshot":{"capture_mode":{"screen":"Main Display"}}}
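The two URL shapes above can be produced by a single builder. The following is a minimal sketch, not the PR's actual `utils.ts`: the type names are hypothetical, the `window` capture mode is assumed by analogy with `screen`, and percent-encoding the JSON payload is an assumption (the URLs above are shown unencoded for readability).

```typescript
// Hypothetical TypeScript model of the deeplink payloads listed above.
// Assumption: a window capture mode mirrors the documented { screen: ... } shape.
type CaptureMode = { screen: string } | { window: string };

// Unit variants are sent as a bare JSON string.
type UnitAction =
  | "stop_recording"
  | "pause_recording"
  | "resume_recording"
  | "toggle_pause_recording"
  | "restart_recording";

// Struct variants are sent as a single-key JSON object.
type StructAction = { take_screenshot: { capture_mode: CaptureMode } };

type DeepLinkAction = UnitAction | StructAction;

// Assumption: the JSON payload is percent-encoded before it goes into the query string.
function buildDeepLink(action: DeepLinkAction): string {
  return `cap-desktop://action?value=${encodeURIComponent(JSON.stringify(action))}`;
}

const pauseUrl = buildDeepLink("pause_recording");
const shotUrl = buildDeepLink({
  take_screenshot: { capture_mode: { screen: "Main Display" } },
});

console.log(pauseUrl);
console.log(shotUrl);
```

Serialising with `JSON.stringify` covers both shapes: a unit action becomes a quoted JSON string and a struct action becomes a nested object, so the builder needs no per-action branching.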

Raycast Extension (extensions/raycast/)

8 commands for full Cap control from Raycast:

  • Start / Stop / Pause / Resume / Toggle Pause / Restart recording
  • Take Screenshot
  • Open Settings

All commands use the cap-desktop:// deeplink protocol.

Changes

  • apps/desktop/src-tauri/src/deeplink_actions.rs: Added 5 new DeepLinkAction variants with their execute implementations
  • extensions/raycast/: New Raycast extension with TypeScript commands

Closes #1540

Greptile Summary

This PR extends Cap's deeplink protocol with five new recording-control actions (PauseRecording, ResumeRecording, TogglePauseRecording, RestartRecording, TakeScreenshot) and ships a new Raycast extension that exposes all of them as quick commands. The Rust-side additions are straightforward and correctly delegate to the existing recording subsystem.

Key issues to address before merging:

- Missing icon file (extensions/raycast/package.json): package.json declares "icon": "command-icon.png" but no such file exists in the commit. ray build / ray develop will fail until the PNG is added.
- Hardcoded "Main Display" capture name (start-recording.ts, take-screenshot.ts): macOS display names are model strings (e.g. "Built-in Retina Display"), not "Main Display". When no display matches, the backend silently drops the action. Both commands need a real fallback strategy.
- Duplicated capture-target resolution in deeplink_actions.rs: The CaptureMode → ScreenCaptureTarget block is copied verbatim between StartRecording and TakeScreenshot; a small private helper would remove the duplication.
- HUD reports success unconditionally (utils.ts): showHUD fires after open(url) regardless of whether Cap is running or the action succeeded; rewording the messages to say "Sending…" instead of confirming completion would set more accurate expectations.

Confidence Score: 3/5

Not safe to merge as-is: the missing icon file blocks the extension build, and the hardcoded "Main Display" string silently breaks the two most visible commands for most users.

The Rust backend changes are solid and correctly wired up. However, the Raycast extension has a build-time blocker (missing icon) and a user-facing runtime failure (hardcoded display name) that will silently affect every user whose primary display isn't literally named "Main Display". These are concrete, reproducible failures on the primary feature path.

extensions/raycast/package.json (missing icon), extensions/raycast/src/start-recording.ts and take-screenshot.ts (hardcoded display name)

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/desktop/src-tauri/src/deeplink_actions.rs | Adds 5 new DeepLinkAction variants (PauseRecording, ResumeRecording, TogglePauseRecording, RestartRecording, TakeScreenshot); logic is correct but capture-target resolution is duplicated verbatim from StartRecording. |
| extensions/raycast/package.json | Declares extension metadata and 8 commands correctly, but references a missing icon file (command-icon.png) that will block the build. |
| extensions/raycast/src/utils.ts | Clean deeplink builder that correctly JSON-serialises both unit and struct enum variants; HUD always shows success regardless of whether Cap is running. |
| extensions/raycast/src/start-recording.ts | Hardcodes capture_mode to "Main Display", which is not a valid macOS display name for most machines; will silently fail at the backend. |
| extensions/raycast/src/take-screenshot.ts | Same hardcoded "Main Display" issue as start-recording.ts; will silently fail when no display has that exact name. |

Sequence Diagram

sequenceDiagram
    participant User
    participant Raycast
    participant macOS as macOS URL Scheme
    participant Tauri as Cap Desktop (Tauri)
    participant Recording as recording.rs

    User->>Raycast: Invoke command (e.g. Pause Recording)
    Raycast->>Raycast: buildDeepLink action to cap-desktop:// URL
    Raycast->>macOS: open(url)
    Raycast->>Raycast: showHUD message (unconditional)
    macOS->>Tauri: Deliver deeplink URL
    Tauri->>Tauri: DeepLinkAction::try_from - serde_json parse
    Tauri->>Recording: pause_recording(app, state)
    Recording-->>Tauri: Ok or Err
    Note over Tauri: Errors only printed to stderr
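
To make the parse step in the diagram concrete, here is a rough TypeScript mirror of what the receiving side has to do with the `value` parameter. It is only an illustration: the real parsing happens in Rust via `serde_json`, and `parseDeepLink` and `ParsedAction` are hypothetical names. It relies on serde's default externally tagged enum encoding, where a unit variant is a JSON string and a struct variant is a single-key object.

```typescript
// Hypothetical result type; the real handler deserialises into a Rust enum.
type ParsedAction =
  | { kind: "unit"; name: string }
  | { kind: "struct"; name: string; params: unknown };

function parseDeepLink(url: string): ParsedAction {
  // Everything after the first "?" is the query string.
  const query = url.slice(url.indexOf("?") + 1);
  const raw = new URLSearchParams(query).get("value");
  if (raw === null) throw new Error("missing value parameter");

  const value: unknown = JSON.parse(raw);

  // Unit variant: a bare JSON string such as "pause_recording".
  if (typeof value === "string") return { kind: "unit", name: value };

  // Struct variant: an object with exactly one key naming the variant.
  if (typeof value === "object" && value !== null && !Array.isArray(value)) {
    const entries = Object.entries(value as Record<string, unknown>);
    const first = entries[0];
    if (entries.length === 1 && first !== undefined) {
      const [name, params] = first;
      return { kind: "struct", name, params };
    }
  }
  throw new Error("unrecognised action payload");
}

const unitAction = parseDeepLink('cap-desktop://action?value="pause_recording"');
const structAction = parseDeepLink(
  'cap-desktop://action?value={"take_screenshot":{"capture_mode":{"screen":"Main Display"}}}',
);
console.log(unitAction, structAction);
```

Any payload that fails to parse surfaces as an error here; in the diagram the equivalent failure is only written to stderr, which is why the user sees nothing.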
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/raycast/package.json
Line: 7

Comment:
**Missing icon file**

`package.json` declares `"icon": "command-icon.png"` but no such file exists anywhere in the `extensions/raycast/` directory (confirmed by inspecting the full commit tree). Raycast validates the icon at build time, so `ray build` and `ray develop` will fail with a missing-asset error until this PNG is added to the extension root.

You need to add a `command-icon.png` (typically 512×512 px) to `extensions/raycast/` before this extension can be built or published.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/raycast/src/start-recording.ts
Line: 7

Comment:
**Hardcoded display name will fail for most users**

`"Main Display"` is not a real macOS display name. macOS reports display names as the monitor model (e.g. `"Built-in Retina Display"`, `"LG Ultra HD"`). The backend resolves the target by doing an exact string match:

```rust
.find(|(s, _)| s.name == name)
.ok_or(format!("No screen with name \"{}\"", &name))?
```

If no display has that name the deeplink handler returns an error, which is only printed to `stderr` — the user sees no feedback and the command silently does nothing.

The same issue applies to `take-screenshot.ts` (line 7):
```
extensions/raycast/src/take-screenshot.ts:7
```

Consider either:
- Querying available displays first and letting the user pick (requires a view-mode command), or
- Documenting that users must edit these files with their actual display name, or
- Using index-based selection instead of name-based lookup.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: apps/desktop/src-tauri/src/deeplink_actions.rs
Line: 171-188

Comment:
**Duplicated capture target resolution logic**

The `CaptureMode` → `ScreenCaptureTarget` mapping (lines 172–183) is an exact copy of the same block already present in the `StartRecording` arm (lines 130–141). If the display/window lookup logic ever changes (error messages, fallback behaviour, etc.) both copies would need to be updated.

Consider extracting it into a small helper function:

```rust
fn resolve_capture_target(capture_mode: CaptureMode) -> Result<ScreenCaptureTarget, String> {
    match capture_mode {
        CaptureMode::Screen(name) => cap_recording::screen_capture::list_displays()
            .into_iter()
            .find(|(s, _)| s.name == name)
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or(format!("No screen with name \"{}\"", &name)),
        CaptureMode::Window(name) => cap_recording::screen_capture::list_windows()
            .into_iter()
            .find(|(w, _)| w.name == name)
            .map(|(w, _)| ScreenCaptureTarget::Window { id: w.id })
            .ok_or(format!("No window with name \"{}\"", &name)),
    }
}
```

Then both arms become `let capture_target = resolve_capture_target(capture_mode)?;`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/raycast/src/utils.ts
Line: 27-29

Comment:
**HUD shows success regardless of whether Cap is running**

`await open(url)` launches the URL scheme handler but returns before the Cap process actually processes — or even receives — the deeplink. If Cap is not running, the user still sees "⏸ Pausing Cap recording…" (or similar), making it impossible to distinguish a successful action from a no-op.

One lightweight improvement: show the HUD *before* the `open` call (so it acts as an "attempting" message rather than a "done" message) and clarify the wording. For example, "Sending pause command to Cap…" sets more realistic expectations than "Pausing Cap recording…".

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat: extend deeplinks for recording con..."

Greptile also left 4 inline comments on this PR.


Add new deeplink actions for full recording control:
- PauseRecording
- ResumeRecording
- TogglePauseRecording
- RestartRecording
- TakeScreenshot (with capture mode)

Build Raycast extension (extensions/raycast/) with commands for:
- Start/Stop/Pause/Resume/Toggle pause recording
- Restart recording
- Take screenshot
- Open settings

All commands use the cap-desktop:// deeplink protocol.

Closes CapSoftware#1540
"title": "Cap",
"description": "Control Cap screen recorder — start/stop recording, pause, resume, take screenshots, and more.",
"icon": "command-icon.png",
"author": "cap",

P1 Missing icon file

package.json declares "icon": "command-icon.png" but no such file exists anywhere in the extensions/raycast/ directory (confirmed by inspecting the full commit tree). Raycast validates the icon at build time, so ray build and ray develop will fail with a missing-asset error until this PNG is added to the extension root.

You need to add a command-icon.png (typically 512×512 px) to extensions/raycast/ before this extension can be built or published.


await triggerDeepLink(
{
start_recording: {
capture_mode: { screen: "Main Display" },

P1 Hardcoded display name will fail for most users

"Main Display" is not a real macOS display name. macOS reports display names as the monitor model (e.g. "Built-in Retina Display", "LG Ultra HD"). The backend resolves the target by doing an exact string match:

.find(|(s, _)| s.name == name)
.ok_or(format!("No screen with name \"{}\"", &name))?

If no display has that name the deeplink handler returns an error, which is only printed to stderr — the user sees no feedback and the command silently does nothing.

The same issue applies to take-screenshot.ts (line 7):

extensions/raycast/src/take-screenshot.ts:7

Consider either:

  • Querying available displays first and letting the user pick (requires a view-mode command), or
  • Documenting that users must edit these files with their actual display name, or
  • Using index-based selection instead of name-based lookup.
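
One way to picture a fallback on the extension side is sketched below. This is a hypothetical helper, not part of the PR: `pickScreenName` and its parameters are illustrative names, and it assumes the caller can already supply the list of display names Cap would recognise (the PR does not show how the extension would obtain that list).

```typescript
// Hypothetical helper: prefer the user's configured display name when it exists,
// otherwise fall back to the first available display.
// Assumption: `available` holds the display names known to Cap.
function pickScreenName(available: string[], preferred?: string): string | undefined {
  if (preferred !== undefined && available.includes(preferred)) {
    return preferred;
  }
  // Returns undefined when no displays are known, so the caller can
  // surface an error instead of sending a deeplink that cannot match.
  return available[0];
}

const displays = ["Built-in Retina Display", "LG Ultra HD"];
const fallbackName = pickScreenName(displays, "Main Display");
const exactName = pickScreenName(displays, "LG Ultra HD");
const noDisplay = pickScreenName([], "Main Display");
console.log(fallbackName, exactName, noDisplay);
```

Returning `undefined` rather than a made-up name keeps the failure visible on the Raycast side, instead of deferring it to the backend's exact string match.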

Comment on lines +171 to +188
DeepLinkAction::TakeScreenshot { capture_mode } => {
    let capture_target: ScreenCaptureTarget = match capture_mode {
        CaptureMode::Screen(name) => cap_recording::screen_capture::list_displays()
            .into_iter()
            .find(|(s, _)| s.name == name)
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or(format!("No screen with name \"{}\"", &name))?,
        CaptureMode::Window(name) => cap_recording::screen_capture::list_windows()
            .into_iter()
            .find(|(w, _)| w.name == name)
            .map(|(w, _)| ScreenCaptureTarget::Window { id: w.id })
            .ok_or(format!("No window with name \"{}\"", &name))?,
    };

    crate::recording::take_screenshot(app.clone(), capture_target)
        .await
        .map(|_| ())
}

P2 Duplicated capture target resolution logic

The CaptureMode → ScreenCaptureTarget mapping (lines 172–183) is an exact copy of the same block already present in the StartRecording arm (lines 130–141). If the display/window lookup logic ever changes (error messages, fallback behaviour, etc.) both copies would need to be updated.

Consider extracting it into a small helper function:

fn resolve_capture_target(capture_mode: CaptureMode) -> Result<ScreenCaptureTarget, String> {
    match capture_mode {
        CaptureMode::Screen(name) => cap_recording::screen_capture::list_displays()
            .into_iter()
            .find(|(s, _)| s.name == name)
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or(format!("No screen with name \"{}\"", &name)),
        CaptureMode::Window(name) => cap_recording::screen_capture::list_windows()
            .into_iter()
            .find(|(w, _)| w.name == name)
            .map(|(w, _)| ScreenCaptureTarget::Window { id: w.id })
            .ok_or(format!("No window with name \"{}\"", &name)),
    }
}

Then both arms become let capture_target = resolve_capture_target(capture_mode)?;.



Comment on lines +27 to +29
const url = buildDeepLink(action);
await open(url);
await showHUD(hudMessage);

P2 HUD shows success regardless of whether Cap is running

await open(url) launches the URL scheme handler but returns before the Cap process actually processes — or even receives — the deeplink. If Cap is not running, the user still sees "⏸ Pausing Cap recording…" (or similar), making it impossible to distinguish a successful action from a no-op.

One lightweight improvement: show the HUD before the open call (so it acts as an "attempting" message rather than a "done" message) and clarify the wording. For example, "Sending pause command to Cap…" sets more realistic expectations than "Pausing Cap recording…".
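
The reordering suggested above could look roughly like the sketch below. It is a sketch under stated assumptions, not the PR's code: the `Deps` type is hypothetical and exists so the example runs without `@raycast/api`; in the extension, `open` and `showHUD` would be the functions imported from that package. The error message text is also illustrative.

```typescript
// Hypothetical dependency bundle standing in for @raycast/api's open and showHUD.
type Deps = {
  open: (url: string) => Promise<void>;
  showHUD: (message: string) => Promise<void>;
};

async function triggerDeepLink(url: string, attemptMessage: string, deps: Deps): Promise<void> {
  // Show the "attempting" message first, so the HUD never claims completion.
  await deps.showHUD(attemptMessage);
  try {
    await deps.open(url);
  } catch (err) {
    // Only reached if handing the URL to the OS fails; a successful open()
    // still does not prove that Cap received or executed the action.
    await deps.showHUD("Could not send command to Cap");
    throw err;
  }
}

// Usage with recording fakes to show the call order.
const calls: string[] = [];
const fakeDeps: Deps = {
  open: async (url) => { calls.push(`open:${url}`); },
  showHUD: async (message) => { calls.push(`hud:${message}`); },
};
const done = triggerDeepLink(
  'cap-desktop://action?value="pause_recording"',
  "Sending pause command to Cap…",
  fakeDeps,
);
```

This only fixes the wording and ordering; confirming that Cap actually handled the action would need a response channel that the deeplink protocol, as described here, does not provide.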


- Add icon.png placeholder (was referencing missing command-icon.png)
- Use default display instead of hardcoded 'Main Display'
- Extract duplicated capture target resolution into CaptureMode::resolve()
- Show error HUD when Cap is not running instead of false success

Development

Successfully merging this pull request may close these issues.

Bounty: Deeplinks support + Raycast Extension

2 participants