feat: Prevent accidental re-onboarding of customer site which was onboarded earlier with paid profile#2007
feat: Prevent accidental re-onboarding of customer site which was onboarded earlier with paid profile#2007tkotthakota-adobe wants to merge 12 commits intomainfrom
Conversation
|
This PR will trigger no release when merged. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
solaris007
left a comment
There was a problem hiding this comment.
Hey @tkotthakota-adobe,
Strengths
- Guard placement before side effects (
src/support/utils.js:1212-1228). Thesay()messages were correctly moved after the guard check so users don't see "Starting environment setup..." noise for a site that immediately gets blocked. Thoughtful UX. - Clean separation of UI and logic (
src/support/slack/actions/onboard-modal.js:433-460). The modal gets a checkbox, the value flows throughadditionalParams.force, and the guard inonboardSingleSiteis agnostic to the UI layer. The batch path simply never passesforce, which is the correct default. - Well-structured dedicated test file (
test/support/utils-paid-profile-guard.test.js). Separating the guard tests rather than cramming them into the already-large utils test file is the right call. TheassertGuardNotTriggeredhelper is a clean pattern, and the esmock isolation is solid. - Optional chaining handles null/undefined at every level (
utils.js:1219).existingSite?.getConfig()?.getHandlers()?.lastOnboardProfilecorrectly handles absent sites and missing config without throwing.
Issues
Critical (Must Fix)
1. lastOnboardProfile stored in wrong namespace via direct state mutation - src/support/utils.js:1380-1385
Three problems here:
- The
handlersmap is for audit/import handler configuration - its Joi schema expects object values, not plain strings. Writing a string value violates the schema contract (it passes today only because validation swallows errors). - Direct mutation of
siteConfig.state.handlersbypasses the Config model's API. IfgetHandlers()returns a frozen object or a defensive copy, the mutation silently does nothing. - No test verifies this value round-trips through
Config.toDynamoItemand back. If the serialization drops unknown fields, the guard is dead code in production while all tests pass.
Fix: Store lastOnboardProfile in a dedicated namespace (e.g., config.onboarding.lastProfile or a first-class Site attribute). Use the Config model's setter methods rather than reaching into state directly. Add an integration test that serializes and deserializes the config to verify the field persists.
2. No migration path for existing sites - guard is unenforceable retroactively - src/support/utils.js:1216
The guard reads lastOnboardProfile from existing site config. Every site onboarded before this PR lands has no such field, so the guard is a no-op for the entire existing fleet. The sites most at risk (current paid customers) are the ones left unprotected.
Fix: Either (a) provide a one-time backfill script that sets lastOnboardProfile for sites currently known to have paid profiles, or (b) infer paid status from existing config capabilities (e.g., presence of paid-tier imports or scheduledRun: true) rather than relying solely on a stored label. At minimum, document this limitation in the PR description.
Important (Should Fix)
3. Duplicate findByBaseURL call adds unnecessary latency - src/support/utils.js:1215
The guard calls SiteLookup.findByBaseURL(baseURL) to check for an existing site. The function already performs a site lookup further down in createSiteAndOrganization. This doubles DB reads for every onboarding call - in CSV batch mode with many rows, that adds up. It also creates a TOCTOU window where concurrent onboards of the same URL could race past each other.
Fix: Hoist the existing site-existence check above the guard and share the result, or move the guard into the block where the existing site is already available.
4. No error handling around the guard DB lookup - src/support/utils.js:1215
If findByBaseURL throws (DB timeout, connection error), the exception propagates through the outer try block and the user sees a confusing SFN error message. A transient DB error now blocks onboarding entirely.
Fix: Wrap the guard block in its own try-catch. On error, log a warning and proceed (fail-open), or fail with a clear error message that identifies the guard as the failure point.
5. CSV batch and slash command have no force override - src/support/utils.js:1213
The modal has a Force Onboard checkbox, but the CSV batch path and slash command have no equivalent. An operator who legitimately needs to re-onboard a paid site via CSV is stuck - they would need to onboard each blocked site individually through the modal.
Fix: Support a force column in the CSV schema, or provide a command-level flag. Document which path to use for intentional re-onboarding.
6. PAID_PROFILES is a fragile, disconnected constant - src/support/utils.js:1154
const PAID_PROFILES = ['paid'] creates a second source of truth for which profiles are "protected," disconnected from the actual profile definitions in profiles.json. When a new paid tier is added, someone must remember to update this array in a separate file.
Fix: Derive the paid/protected classification from the profile definition itself (e.g., add a protected: true flag in profiles.json) or load from a shared constant.
7. No audit trail when force override is used
When force=true bypasses the guard, nothing is logged. There is no way to answer "who force-re-onboarded this paid site and when?" after the fact.
Fix: Add a log.warn when force bypasses the guard for a known paid site, including the site URL, previous profile, and new profile.
8. Guard message references modal in non-modal contexts - src/support/utils.js:1220
The blocked-site message says "select Force Onboard in the modal and resubmit." When triggered from CSV batch or slash command, there is no modal. The message is misleading.
Fix: Make the message context-aware, or generalize it (e.g., "To override, use force mode").
Minor (Nice to Have)
9. Case-sensitive profile matching - src/support/utils.js:1213
PAID_PROFILES.includes(profileName) is case-sensitive. If a caller passes 'Paid' or 'PAID', the guard is silently bypassed.
10. Test gap: assertGuardNotTriggered gives false confidence - test/support/utils-paid-profile-guard.test.js
The helper catches all errors and only checks that say() wasn't called with the warning pattern. If a future bug causes the guard to throw before reaching say(), the test still passes - but the guard is broken. Consider also asserting result.status !== 'Failed' for the expected-to-pass cases.
11. Only one blocking scenario tested - test/support/utils-paid-profile-guard.test.js
Only paid -> demo is tested as a blocking case. The guard would also block paid -> test, paid -> free_trial, etc. Adding at least one more blocking test would confirm the condition generalizes.
Recommendations
- The Config model needs a proper metadata namespace. This PR is hitting a real limitation - there is no good place to store onboarding metadata on a site's config. Adding an
onboardingsection to the Config schema in spacecat-shared-data-access would prevent future PRs from also reaching forhandlersas a dumping ground. - Consider a profile-comparison approach instead of label-based. Comparing effective capability sets (which audits/imports are enabled) rather than stored labels makes the guard resilient to profile renaming and the retroactive gap.
- Group the guard check and the profile-write closer together in the code. Currently they are ~170 lines apart, making it easy for future changes to break one without noticing the other.
Assessment
Ready to merge? No - with fixes.
The core idea is sound and worth shipping. The guard logic is clean, the test structure is solid, and the UX is thoughtful. However, the siteConfig.state.handlers mutation may silently fail to persist (Critical #1), leaving the guard inert in production with no test to catch it. The lack of backfill (Critical #2) means the entire existing paid fleet is unprotected. These are structural issues that need resolution before merge - they determine whether this feature actually works.
Next Steps
- Address the 2 critical issues first - fix the storage namespace and verify persistence with a round-trip test, then decide on a backfill strategy.
- Review the 6 important items - the duplicate DB lookup, error handling, force override for non-modal paths, and audit logging are all straightforward fixes.
- Minor items are optional but would strengthen the implementation.
|
@solaris007 Thanks for the detailed review. Addressed review comments. I am working on testing. Will post test results later. |
|
@solaris007 related PR: adobe/spacecat-shared#1464 |
solaris007
left a comment
There was a problem hiding this comment.
Hey @tkotthakota-adobe,
Thanks for the thorough rework - this is a significant improvement over the first iteration.
Previously Flagged Issues - Resolution Status
All 11 findings from the prior review have been addressed:
- Critical #1 (wrong namespace/direct state mutation): Resolved - now uses
siteConfig.updateOnboardConfig()via the Config model API. - Critical #2 (no backfill for existing sites): Resolved by design change - the guard now checks
getImports()?.['ahref-paid-pages'], data that already exists on every paid-onboarded site. No migration needed. - Important #3 (duplicate DB call): Resolved - single
prefetchedSitelookup shared via parameter. - Important #4 (no error handling): Resolved - try-catch with fail-open and
log.warn. - Important #5 (CSV/command no force): Resolved - CSV batch intentionally excludes force with clear comment explaining the rationale.
- Important #6 (hardcoded PAID_PROFILES): Resolved - replaced with
profile.protectedflag inprofiles.json. - Important #7 (no audit trail): Resolved -
log.warn+ Slack message on force override, plusupdateOnboardConfigpersistslastProfileandlastStartTime. - Important #8 (modal-only message): Resolved - message now references
/onboardcommand. - Minor #9-11 (case sensitivity, test gaps): All resolved or no longer relevant due to the design change.
Strengths
- Guard detection via existing data (
src/support/utils.js:1311). CheckinggetImports()?.['ahref-paid-pages']instead of a stored profile label is a materially better approach. It uses data already in production, requires no migration, and cannot drift out of sync with reality. profile.protectedis the right abstraction (static/onboard/profiles.json:114). Collocating the protection flag with the profile definition means adding a new protected profile is a one-line JSON change. No code modification needed.- Fail-open on lookup error (
src/support/utils.js:1301-1305). The guard is protective, not load-bearing - a transient DB error should not block onboarding. Correct resilience posture. prefetchedSitepass-through (src/support/utils.js:1369). Eliminates the redundant DB call without changing the contract for other callers ofcreateSiteAndOrganization(they still get the internal lookup via the?? awaitfallback).- CSV batch exclusion with clear rationale (
src/support/slack/commands/onboard.js:210-214). Batch operations amplify mistakes. Requiring per-site confirmation via the modal is a proportionate control.
Issues
Critical (Must Fix)
1. Gist-hosted tarball dependency must not ship - package.json:80
@adobe/spacecat-shared-data-access is installed from a personal GitHub Gist tarball instead of the npm registry:
https://gist.github.com/tkotthakota-adobe/.../adobe-spacecat-shared-data-access-3.29.0.tgz
This is clearly a development workaround while spacecat-shared#1464 is still open, but it must be reverted before merge:
- Gist content is mutable - the author (or anyone with account access) can silently replace the tarball
- No provenance connecting the tarball to a specific git commit or CI build
- Bypasses npm audit, Dependabot, and all supply chain tooling
- If the gist is deleted,
npm installbreaks for every developer and CI run - The
package-lock.jsonpeer: truenoise is a side effect of this - it resolves automatically when reverted
Fix: Merge and publish spacecat-shared#1464 first, then update this PR to the published npm version. Alternatively, decouple this PR: the guard logic (lines 1298-1326) is completely independent of updateOnboardConfig and could land now without the shared lib dependency. The config tracking could follow in a separate PR once the shared lib publishes.
Important (Should Fix)
2. updateOnboardConfig is never exercised by any test - src/support/utils.js:1481
The updateOnboardConfig({ lastProfile, lastStartTime }) call sits on the happy path of every successful onboarding. The guard tests intentionally fail before reaching it. The audits.test.js additions add the getter mock (getOnboardConfig) but no test calls the setter. If this method is absent or has a different signature in the published Config model, every successful onboarding will throw a TypeError in production with no test catching it.
Fix: Add a test that reaches the updateOnboardConfig call - either a full happy-path integration test, or at minimum a unit test that stubs the Config model and asserts the method was called with the expected shape.
3. Force override path has no positive assertion - test/support/utils-paid-profile-guard.test.js:79
The force=true test uses assertGuardNotTriggered, which only verifies the blocking message was NOT sent. It does not assert that the force warning message WAS sent. If the say() in the force branch is accidentally removed, the test still passes.
Fix: Add expect(sayStub).to.have.been.calledWith(sinon.match(/Force re-onboarding/)) to the force=true test.
4. Em-dash in guard message - src/support/utils.js:1315
await say(`:warning: Force re-onboarding \`${baseURL}\` — overriding paid profile...`);The em-dash should be a regular dash (-) per project conventions.
Minor (Nice to Have)
5. ahref-paid-pages is a brittle proxy for "paid profile" - src/support/utils.js:1311
The guard checks for a specific import key as a signal the site was previously paid-onboarded. This works today but creates a hidden coupling - if this import is added to another profile, or a future paid variant omits it, the guard silently breaks. Consider adding a code comment explaining the coupling, or extract the key as a named constant shared with the profile definition.
6. say() ordering in the blocking path - src/support/utils.js:1318-1323
The guard calls await say(msg) before setting reportLine.errors. If say() throws (Slack API down), the report line is never populated. Consider setting reportLine first, then calling say().
7. updateOnboardConfig does not record force override - src/support/utils.js:1481
When force=true, the config only records lastProfile and lastStartTime. A future audit cannot distinguish a forced override from a normal onboarding. Consider adding forcedOverride: true to the payload when additionalParams.force is set.
Recommendations
- Decouple the guard from
updateOnboardConfig. The guard logic is completely independent of the shared lib change. Landing the guard now (withoutupdateOnboardConfig) gets the safety feature into production faster. The config tracking can follow once spacecat-shared#1464 publishes. - Add a CI lint rule to flag non-registry dependency URLs in
package.json(gist, raw.githubusercontent.com, arbitrary tarballs). A one-liner grep in CI would catch this class of issue before it reaches review.
Assessment
Ready to merge? No - one blocker remaining.
The architectural rework is excellent. The import-based detection, profile.protected flag, fail-open guard, prefetched site reuse, and CSV force exclusion are all well-designed. Every finding from the prior review has been addressed. The sole blocker is the gist-hosted dependency (Critical #1) - once that is resolved (either by publishing spacecat-shared#1464 or by decoupling the guard from updateOnboardConfig), this is ready to go. The Important items (#2-#4) are straightforward fixes.
Next Steps
package.json
Outdated
| "@adobe/spacecat-shared-brand-client": "1.1.39", | ||
| "@adobe/spacecat-shared-content-client": "1.8.19", | ||
| "@adobe/spacecat-shared-data-access": "^3.29.0", | ||
| "@adobe/spacecat-shared-data-access": "https://gist.github.com/tkotthakota-adobe/f672321057d6b20cae9981718c4c53d9/raw/33a7b3b76c8c94b7b98909faa27c833189bcbac8/adobe-spacecat-shared-data-access-3.29.0.tgz", |
There was a problem hiding this comment.
Critical: This gist-hosted tarball must be reverted to an npm registry reference before merge. Merge and publish spacecat-shared#1464 first, then pin the published version. Alternatively, decouple this PR - the guard logic is independent of updateOnboardConfig and could land now without the shared lib dependency.
There was a problem hiding this comment.
This will be changed after merged the spacecat-shared-data-access change.

https://jira.corp.adobe.com/browse/SITES-42173
Adds a guard to onboard site command that blocks re-onboarding a paid-profile site with a lower-tier profile to avoid unintentional config changes.
Single site onboarding: An optional Force Onboard checkbox is added to the onboard modal. When checked, it bypasses the guard allowing intentional re-onboarding with a lower-tier profile.
CSV batch onboarding: The guard runs per-site inside the batch loop. Each row is evaluated independently — if a site in the CSV was previously onboarded with a paid profile (paid) and the CSV row specifies a lower-tier profile (e.g. demo, test), that site is skipped with a Slack warning. All other rows in the batch are unaffected and continue to process normally.
Related PR: adobe/spacecat-shared#1464