Skip to content

feat: Prevent accidental re-onboarding of customer site which was onboarded earlier with paid profile#2007

Open
tkotthakota-adobe wants to merge 12 commits intomainfrom
prod-profile
Open

feat: Prevent accidental re-onboarding of customer site which was onboarded earlier with paid profile#2007
tkotthakota-adobe wants to merge 12 commits intomainfrom
prod-profile

Conversation

@tkotthakota-adobe
Copy link
Contributor

@tkotthakota-adobe tkotthakota-adobe commented Mar 20, 2026

https://jira.corp.adobe.com/browse/SITES-42173

Adds a guard to onboard site command that blocks re-onboarding a paid-profile site with a lower-tier profile to avoid unintentional config changes.

  • Single site onboarding: An optional Force Onboard checkbox is added to the onboard modal. When checked, it bypasses the guard allowing intentional re-onboarding with a lower-tier profile.

  • CSV batch onboarding: The guard runs per-site inside the batch loop. Each row is evaluated independently — if a site in the CSV was previously onboarded with a paid profile (paid) and the CSV row specifies a lower-tier profile (e.g. demo, test), that site is skipped with a Slack warning. All other rows in the batch are unaffected and continue to process normally.

Related PR: adobe/spacecat-shared#1464

@github-actions
Copy link

This PR will trigger no release when merged.

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@tkotthakota-adobe tkotthakota-adobe changed the title feat: Prevent accidentely reonboarding a paid profile customer site feat: Prevent accidental re-onboarding of customer site which was onboarded earlier with paid profile Mar 20, 2026
Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tkotthakota-adobe,

Strengths

  • Guard placement before side effects (src/support/utils.js:1212-1228). The say() messages were correctly moved after the guard check so users don't see "Starting environment setup..." noise for a site that immediately gets blocked. Thoughtful UX.
  • Clean separation of UI and logic (src/support/slack/actions/onboard-modal.js:433-460). The modal gets a checkbox, the value flows through additionalParams.force, and the guard in onboardSingleSite is agnostic to the UI layer. The batch path simply never passes force, which is the correct default.
  • Well-structured dedicated test file (test/support/utils-paid-profile-guard.test.js). Separating the guard tests rather than cramming them into the already-large utils test file is the right call. The assertGuardNotTriggered helper is a clean pattern, and the esmock isolation is solid.
  • Optional chaining handles null/undefined at every level (utils.js:1219). existingSite?.getConfig()?.getHandlers()?.lastOnboardProfile correctly handles absent sites and missing config without throwing.

Issues

Critical (Must Fix)

1. lastOnboardProfile stored in wrong namespace via direct state mutation - src/support/utils.js:1380-1385

Three problems here:

  • The handlers map is for audit/import handler configuration - its Joi schema expects object values, not plain strings. Writing a string value violates the schema contract (it passes today only because validation swallows errors).
  • Direct mutation of siteConfig.state.handlers bypasses the Config model's API. If getHandlers() returns a frozen object or a defensive copy, the mutation silently does nothing.
  • No test verifies this value round-trips through Config.toDynamoItem and back. If the serialization drops unknown fields, the guard is dead code in production while all tests pass.

Fix: Store lastOnboardProfile in a dedicated namespace (e.g., config.onboarding.lastProfile or a first-class Site attribute). Use the Config model's setter methods rather than reaching into state directly. Add an integration test that serializes and deserializes the config to verify the field persists.

2. No migration path for existing sites - guard is unenforceable retroactively - src/support/utils.js:1216

The guard reads lastOnboardProfile from existing site config. Every site onboarded before this PR lands has no such field, so the guard is a no-op for the entire existing fleet. The sites most at risk (current paid customers) are the ones left unprotected.

Fix: Either (a) provide a one-time backfill script that sets lastOnboardProfile for sites currently known to have paid profiles, or (b) infer paid status from existing config capabilities (e.g., presence of paid-tier imports or scheduledRun: true) rather than relying solely on a stored label. At minimum, document this limitation in the PR description.

Important (Should Fix)

3. Duplicate findByBaseURL call adds unnecessary latency - src/support/utils.js:1215

The guard calls SiteLookup.findByBaseURL(baseURL) to check for an existing site. The function already performs a site lookup further down in createSiteAndOrganization. This doubles DB reads for every onboarding call - in CSV batch mode with many rows, that adds up. It also creates a TOCTOU window where concurrent onboards of the same URL could race past each other.

Fix: Hoist the existing site-existence check above the guard and share the result, or move the guard into the block where the existing site is already available.

4. No error handling around the guard DB lookup - src/support/utils.js:1215

If findByBaseURL throws (DB timeout, connection error), the exception propagates through the outer try block and the user sees a confusing SFN error message. A transient DB error now blocks onboarding entirely.

Fix: Wrap the guard block in its own try-catch. On error, log a warning and proceed (fail-open), or fail with a clear error message that identifies the guard as the failure point.

5. CSV batch and slash command have no force override - src/support/utils.js:1213

The modal has a Force Onboard checkbox, but the CSV batch path and slash command have no equivalent. An operator who legitimately needs to re-onboard a paid site via CSV is stuck - they would need to onboard each blocked site individually through the modal.

Fix: Support a force column in the CSV schema, or provide a command-level flag. Document which path to use for intentional re-onboarding.

6. PAID_PROFILES is a fragile, disconnected constant - src/support/utils.js:1154

const PAID_PROFILES = ['paid'] creates a second source of truth for which profiles are "protected," disconnected from the actual profile definitions in profiles.json. When a new paid tier is added, someone must remember to update this array in a separate file.

Fix: Derive the paid/protected classification from the profile definition itself (e.g., add a protected: true flag in profiles.json) or load from a shared constant.

7. No audit trail when force override is used

When force=true bypasses the guard, nothing is logged. There is no way to answer "who force-re-onboarded this paid site and when?" after the fact.

Fix: Add a log.warn when force bypasses the guard for a known paid site, including the site URL, previous profile, and new profile.

8. Guard message references modal in non-modal contexts - src/support/utils.js:1220

The blocked-site message says "select Force Onboard in the modal and resubmit." When triggered from CSV batch or slash command, there is no modal. The message is misleading.

Fix: Make the message context-aware, or generalize it (e.g., "To override, use force mode").

Minor (Nice to Have)

9. Case-sensitive profile matching - src/support/utils.js:1213

PAID_PROFILES.includes(profileName) is case-sensitive. If a caller passes 'Paid' or 'PAID', the guard is silently bypassed.

10. Test gap: assertGuardNotTriggered gives false confidence - test/support/utils-paid-profile-guard.test.js

The helper catches all errors and only checks that say() wasn't called with the warning pattern. If a future bug causes the guard to throw before reaching say(), the test still passes - but the guard is broken. Consider also asserting result.status !== 'Failed' for the expected-to-pass cases.

11. Only one blocking scenario tested - test/support/utils-paid-profile-guard.test.js

Only paid -> demo is tested as a blocking case. The guard would also block paid -> test, paid -> free_trial, etc. Adding at least one more blocking test would confirm the condition generalizes.

Recommendations

  • The Config model needs a proper metadata namespace. This PR is hitting a real limitation - there is no good place to store onboarding metadata on a site's config. Adding an onboarding section to the Config schema in spacecat-shared-data-access would prevent future PRs from also reaching for handlers as a dumping ground.
  • Consider a profile-comparison approach instead of label-based. Comparing effective capability sets (which audits/imports are enabled) rather than stored labels makes the guard resilient to profile renaming and the retroactive gap.
  • Group the guard check and the profile-write closer together in the code. Currently they are ~170 lines apart, making it easy for future changes to break one without noticing the other.

Assessment

Ready to merge? No - with fixes.

The core idea is sound and worth shipping. The guard logic is clean, the test structure is solid, and the UX is thoughtful. However, the siteConfig.state.handlers mutation may silently fail to persist (Critical #1), leaving the guard inert in production with no test to catch it. The lack of backfill (Critical #2) means the entire existing paid fleet is unprotected. These are structural issues that need resolution before merge - they determine whether this feature actually works.

Next Steps

  1. Address the 2 critical issues first - fix the storage namespace and verify persistence with a round-trip test, then decide on a backfill strategy.
  2. Review the 6 important items - the duplicate DB lookup, error handling, force override for non-modal paths, and audit logging are all straightforward fixes.
  3. Minor items are optional but would strengthen the implementation.

@tkotthakota-adobe
Copy link
Contributor Author

tkotthakota-adobe commented Mar 24, 2026

@solaris007 Thanks for the detailed review. Addressed review comments. I am working on testing. Will post test results later.
image

@tkotthakota-adobe
Copy link
Contributor Author

@solaris007 related PR: adobe/spacecat-shared#1464

Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tkotthakota-adobe,

Thanks for the thorough rework - this is a significant improvement over the first iteration.

Previously Flagged Issues - Resolution Status

All 11 findings from the prior review have been addressed:

  • Critical #1 (wrong namespace/direct state mutation): Resolved - now uses siteConfig.updateOnboardConfig() via the Config model API.
  • Critical #2 (no backfill for existing sites): Resolved by design change - the guard now checks getImports()?.['ahref-paid-pages'], data that already exists on every paid-onboarded site. No migration needed.
  • Important #3 (duplicate DB call): Resolved - single prefetchedSite lookup shared via parameter.
  • Important #4 (no error handling): Resolved - try-catch with fail-open and log.warn.
  • Important #5 (CSV/command no force): Resolved - CSV batch intentionally excludes force with clear comment explaining the rationale.
  • Important #6 (hardcoded PAID_PROFILES): Resolved - replaced with profile.protected flag in profiles.json.
  • Important #7 (no audit trail): Resolved - log.warn + Slack message on force override, plus updateOnboardConfig persists lastProfile and lastStartTime.
  • Important #8 (modal-only message): Resolved - message now references /onboard command.
  • Minor #9-11 (case sensitivity, test gaps): All resolved or no longer relevant due to the design change.

Strengths

  • Guard detection via existing data (src/support/utils.js:1311). Checking getImports()?.['ahref-paid-pages'] instead of a stored profile label is a materially better approach. It uses data already in production, requires no migration, and cannot drift out of sync with reality.
  • profile.protected is the right abstraction (static/onboard/profiles.json:114). Collocating the protection flag with the profile definition means adding a new protected profile is a one-line JSON change. No code modification needed.
  • Fail-open on lookup error (src/support/utils.js:1301-1305). The guard is protective, not load-bearing - a transient DB error should not block onboarding. Correct resilience posture.
  • prefetchedSite pass-through (src/support/utils.js:1369). Eliminates the redundant DB call without changing the contract for other callers of createSiteAndOrganization (they still get the internal lookup via the ?? await fallback).
  • CSV batch exclusion with clear rationale (src/support/slack/commands/onboard.js:210-214). Batch operations amplify mistakes. Requiring per-site confirmation via the modal is a proportionate control.

Issues

Critical (Must Fix)

1. Gist-hosted tarball dependency must not ship - package.json:80

@adobe/spacecat-shared-data-access is installed from a personal GitHub Gist tarball instead of the npm registry:

https://gist.github.com/tkotthakota-adobe/.../adobe-spacecat-shared-data-access-3.29.0.tgz

This is clearly a development workaround while spacecat-shared#1464 is still open, but it must be reverted before merge:

  • Gist content is mutable - the author (or anyone with account access) can silently replace the tarball
  • No provenance connecting the tarball to a specific git commit or CI build
  • Bypasses npm audit, Dependabot, and all supply chain tooling
  • If the gist is deleted, npm install breaks for every developer and CI run
  • The package-lock.json peer: true noise is a side effect of this - it resolves automatically when reverted

Fix: Merge and publish spacecat-shared#1464 first, then update this PR to the published npm version. Alternatively, decouple this PR: the guard logic (lines 1298-1326) is completely independent of updateOnboardConfig and could land now without the shared lib dependency. The config tracking could follow in a separate PR once the shared lib publishes.

Important (Should Fix)

2. updateOnboardConfig is never exercised by any test - src/support/utils.js:1481

The updateOnboardConfig({ lastProfile, lastStartTime }) call sits on the happy path of every successful onboarding. The guard tests intentionally fail before reaching it. The audits.test.js additions add the getter mock (getOnboardConfig) but no test calls the setter. If this method is absent or has a different signature in the published Config model, every successful onboarding will throw a TypeError in production with no test catching it.

Fix: Add a test that reaches the updateOnboardConfig call - either a full happy-path integration test, or at minimum a unit test that stubs the Config model and asserts the method was called with the expected shape.

3. Force override path has no positive assertion - test/support/utils-paid-profile-guard.test.js:79

The force=true test uses assertGuardNotTriggered, which only verifies the blocking message was NOT sent. It does not assert that the force warning message WAS sent. If the say() in the force branch is accidentally removed, the test still passes.

Fix: Add expect(sayStub).to.have.been.calledWith(sinon.match(/Force re-onboarding/)) to the force=true test.

4. Em-dash in guard message - src/support/utils.js:1315

await say(`:warning: Force re-onboarding \`${baseURL}\` — overriding paid profile...`);

The em-dash should be a regular dash (-) per project conventions.

Minor (Nice to Have)

5. ahref-paid-pages is a brittle proxy for "paid profile" - src/support/utils.js:1311

The guard checks for a specific import key as a signal the site was previously paid-onboarded. This works today but creates a hidden coupling - if this import is added to another profile, or a future paid variant omits it, the guard silently breaks. Consider adding a code comment explaining the coupling, or extract the key as a named constant shared with the profile definition.

6. say() ordering in the blocking path - src/support/utils.js:1318-1323

The guard calls await say(msg) before setting reportLine.errors. If say() throws (Slack API down), the report line is never populated. Consider setting reportLine first, then calling say().

7. updateOnboardConfig does not record force override - src/support/utils.js:1481

When force=true, the config only records lastProfile and lastStartTime. A future audit cannot distinguish a forced override from a normal onboarding. Consider adding forcedOverride: true to the payload when additionalParams.force is set.

Recommendations

  • Decouple the guard from updateOnboardConfig. The guard logic is completely independent of the shared lib change. Landing the guard now (without updateOnboardConfig) gets the safety feature into production faster. The config tracking can follow once spacecat-shared#1464 publishes.
  • Add a CI lint rule to flag non-registry dependency URLs in package.json (gist, raw.githubusercontent.com, arbitrary tarballs). A one-liner grep in CI would catch this class of issue before it reaches review.

Assessment

Ready to merge? No - one blocker remaining.

The architectural rework is excellent. The import-based detection, profile.protected flag, fail-open guard, prefetched site reuse, and CSV force exclusion are all well-designed. Every finding from the prior review has been addressed. The sole blocker is the gist-hosted dependency (Critical #1) - once that is resolved (either by publishing spacecat-shared#1464 or by decoupling the guard from updateOnboardConfig), this is ready to go. The Important items (#2-#4) are straightforward fixes.

Next Steps

  1. Resolve the dependency: publish spacecat-shared#1464 and update to the npm version, or split this PR to land the guard independently.
  2. Address Important #2-#4 (untested updateOnboardConfig, force assertion, em-dash).
  3. Minor items are optional.

package.json Outdated
"@adobe/spacecat-shared-brand-client": "1.1.39",
"@adobe/spacecat-shared-content-client": "1.8.19",
"@adobe/spacecat-shared-data-access": "^3.29.0",
"@adobe/spacecat-shared-data-access": "https://gist.github.com/tkotthakota-adobe/f672321057d6b20cae9981718c4c53d9/raw/33a7b3b76c8c94b7b98909faa27c833189bcbac8/adobe-spacecat-shared-data-access-3.29.0.tgz",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: This gist-hosted tarball must be reverted to an npm registry reference before merge. Merge and publish spacecat-shared#1464 first, then pin the published version. Alternatively, decouple this PR - the guard logic is independent of updateOnboardConfig and could land now without the shared lib dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be changed after merged the spacecat-shared-data-access change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants