Conversation
…es and that we only spawn one and only one process for a given name, across threads
Ethan-Arrowood
left a comment
There was a problem hiding this comment.
I wont explicitly approve or request changes in case you want to complete my recommendations in follow up work. I'm fine with these changes landing as is and we can iterate. I feel most strongly about making things configurable and being open by default. harper-pro / fabric can lock things down by default, but the harper core should remain as open as possible.
| class ExistingProcessWrapper extends EventEmitter { | ||
| pid: number; | ||
| private checkInterval: NodeJS.Timeout; | ||
|
|
||
| constructor(pid: number) { | ||
| super(); | ||
| this.pid = pid; | ||
|
|
||
| // Monitor process and emit exit event when it terminates | ||
| this.checkInterval = setInterval(() => { | ||
| try { | ||
| // Signal 0 checks if process exists without actually killing it | ||
| process.kill(pid, 0); | ||
| } catch { | ||
| // Process no longer exists | ||
| clearInterval(this.checkInterval); | ||
| this.emit('exit', null, null); | ||
| } | ||
| }, 1000); | ||
| } | ||
|
|
||
| // Kill the process | ||
| kill(signal?: NodeJS.Signals | number) { | ||
| try { | ||
| process.kill(this.pid, signal); | ||
| return true; | ||
| } catch { | ||
| return false; | ||
| } | ||
| } | ||
|
|
||
| // Clean up interval when wrapper is no longer needed | ||
| unref() { | ||
| clearInterval(this.checkInterval); | ||
| return this; | ||
| } | ||
| } |
There was a problem hiding this comment.
| class ExistingProcessWrapper extends EventEmitter { | |
| pid: number; | |
| private checkInterval: NodeJS.Timeout; | |
| constructor(pid: number) { | |
| super(); | |
| this.pid = pid; | |
| // Monitor process and emit exit event when it terminates | |
| this.checkInterval = setInterval(() => { | |
| try { | |
| // Signal 0 checks if process exists without actually killing it | |
| process.kill(pid, 0); | |
| } catch { | |
| // Process no longer exists | |
| clearInterval(this.checkInterval); | |
| this.emit('exit', null, null); | |
| } | |
| }, 1000); | |
| } | |
| // Kill the process | |
| kill(signal?: NodeJS.Signals | number) { | |
| try { | |
| process.kill(this.pid, signal); | |
| return true; | |
| } catch { | |
| return false; | |
| } | |
| } | |
| // Clean up interval when wrapper is no longer needed | |
| unref() { | |
| clearInterval(this.checkInterval); | |
| return this; | |
| } | |
| } | |
| class ExistingProcessWrapper extends EventEmitter { | |
| #pid: number; | |
| #checkInterval: NodeJS.Timeout; | |
| constructor(pid: number) { | |
| super(); | |
| this.#pid = pid; | |
| // Monitor process and emit exit event when it terminates | |
| this.#checkInterval = setInterval(() => { | |
| try { | |
| // Signal 0 checks if process exists without actually killing it | |
| process.kill(pid, 0); | |
| } catch { | |
| // Process no longer exists | |
| clearInterval(this.#checkInterval); | |
| this.emit('exit', null, null); | |
| } | |
| }, 1000); | |
| } | |
| get pid() { | |
| return this.#pid; | |
| } | |
| // Kill the process | |
| kill(signal?: NodeJS.Signals | number) { | |
| try { | |
| process.kill(this.#pid, signal); | |
| return true; | |
| } catch { | |
| return false; | |
| } | |
| } | |
| // Clean up interval when wrapper is no longer needed | |
| unref() { | |
| clearInterval(this.#checkInterval); | |
| return this; | |
| } | |
| } |
Replace private with # so it is actually private. The private keyword is compile time only and has no runtime enforcement.
There was a problem hiding this comment.
ok, sure, but pid wasn't really intended to be private. In trying match node, it is just a plain ol' public property. But yeah, #checkInterval sounds good.
There was a problem hiding this comment.
I assume we wouldn't want the user to be able to set the pid though? Even accidentally? Thats what I guarding for by switching to a getter pattern
There was a problem hiding this comment.
Sounds nice. I think you meant to file a bug with node.js, not comment on a PR ;) (I am just following what node has defined).
There was a problem hiding this comment.
Haha yes, I assumed Node.js had this property locked down so users cannot set it. But maybe that is an invalid assumption.
security/jsLoader.ts
Outdated
| 'fs', | ||
| ]); | ||
| function checkAllowedModulePath(moduleUrl: string, containingFolder: string): boolean { | ||
| const ALLOWED_COMMANDS = new Set(['next', 'npm', 'node']); |
There was a problem hiding this comment.
This should be configurable and expandable somehow. Maybe plugins should have a way to register commands? And the security contract is that if a user trusts a plugin they trust whatever commands that plugin may register as well?
From the perspective of an open core, we need to ensure maximum configurability so that we don't unnecessarily restrict open usage.
There was a problem hiding this comment.
Yeah, I could make this configurable 👍
FWIW, the plugins were running outside the VM, but was changing that with this PR #209
Primarily because it is somewhat confusing that we can provide a consistent behavior of import from 'harper' or 'harperdb' for application (via the VM), but can't do so with the plugin. But yeah, we could also give plugins a way to register commands.
security/jsLoader.ts
Outdated
|
|
||
| function createSpawn(spawnFunction: (...args: any) => child_process.ChildProcess, alwaysAllow?: boolean) { | ||
| const basePath = env.getHdbBasePath(); | ||
| return function (command: string, args: any, options: any) { |
There was a problem hiding this comment.
I don't think it's a good idea to change the function signature of exec() or execFile().
Ideally, the REPLACED_BUILTIN_MODULES would have each function's signature in tact.
I have an idea... bear with me...
// pull in some child_process types:
import * as child_process, { type ChildProcess, type ExecException, type ExecOptions } from 'node:child_process';
import { mkdirSync, writeFileSync, rmSync } from 'node:fs';
import { join } from 'node:path';
// cache this in a global since it doesn't change
const basePath = env.getHdbBasePath();
function checkCommand(fnName: string, command: string, name?: string, alwaysAllow: boolean = false): { existingPid: number | null, pidFilePath: string } {
if (!ALLOWED_COMMANDS.has(command) && !alwaysAllow) {
throw new Error(`Command ${command} is not allowed`);
}
const processName = name;
if (!processName)
throw new Error(
`Calling ${fnName} in Harper must have a process "name" in the options to ensure that a single process is started and reused`
);
// Ensure PID directory exists
const pidDir = join(basePath, 'pids');
mkdirSync(pidDir, { recursive: true });
const pidFilePath = join(pidDir, `${processName}.pid`);
// Try to acquire lock - returns null if we got the lock, or PID if process exists
const existingPid = acquirePidFileLock(pidFilePath);
return { existingPid, pidFilePath };
}
function createPidFile(child: ChildProcess, pidFilePath: string): ChildProcess {
// Write PID to the file we just created
const { pid } = child; // TIL: `pid` can be undefined!
if (pid === undefined) {
// child failed, let the caller handle `.on('error')`
return child;
}
try {
writeFileSync(pidFilePath, pid.toString(), 'utf-8');
} catch (err) {
// Failed to write PID, clean up
child.kill();
rmSync(pidFilePath, { force: true });
throw err;
}
// Clean up PID file when process exits
child.on('exit', () => rmSync(pidFilePath, { force: true }));
return child;
}
// you could shove this in a types file
type ChildProcessWrapper = {
exec(command: string): ChildProcess;
exec(
command: string,
options: ({ encoding: 'buffer' | null } & ExecOptions) | null
): ChildProcess;
exec(
command: string,
callback: (error: ExecException | null, stdout: string, stderr: string) => void
): ChildProcess;
exec(
command: string,
options: ({ encoding: 'buffer' | null } & ExecOptions) | ExecOptions | null,
callback: (error: ExecException | null, stdout: string | Buffer, stderr: string | Buffer) => void
): ChildProcess;
/* execFile, fork, spawn */
};
const REPLACED_BUILTIN_MODULES = {
child_process: {
exec(command: string, options: ExecOptions & { name?: string }, callback?: (error: ExecException | null, stdout: string, stderr: string) => void): ChildProcess {
const { existingPid, pidFilePath } = checkCommand('exec', command.split(' ')[0], options?.name);
if (typeof existingPid === 'number') {
return new ExistingProcessWrapper(existingPid);
}
return createPidFile(child_process.exec(command, options, callback), pidFilePath);
},
/* execFile, fork, spawn */
} as ChildProcessWrapper
};It's a bit more verbose, but it will preserve the function signature types and it will help eliminate a few any types. Whatcha think?
There was a problem hiding this comment.
I am not sure I am following. We are not directly exposing these functions to users, so I am not sure what it means to "preserve the function signature types". Users will be invoking these functions with import { spawn } from 'node:fs'. And AFAIK, the type system will always resolve those types to node's provided types, nothing in this code effects that at all does it?
There was a problem hiding this comment.
Yes, TypeScript should pick up the types from @types/node. Ignore 99% my comment. The function signature for execFile is missing the 4th parameter: child_process.execFile(file[, args][, options][, callback]).
static/defaultConfig.yaml
Outdated
| lockdown: freeze | ||
| containment: vm | ||
| dependencyContainment: false | ||
| allowedShellCommands: |
There was a problem hiding this comment.
| allowedShellCommands: | |
| allowedSpawnCommands: |
There was a problem hiding this comment.
Ugh, just fixed this, this would have saved me a few minutes if I had looked at the PR first :).
| get() {}, // make it look like a resource | ||
| testFork() { | ||
| // Fork should work (allowed command) | ||
| const child = fork('next', ['--version'], { name: 'test-next-process' }); |
There was a problem hiding this comment.
This is going to fail now, no?
| return true; | ||
| }, | ||
| }; | ||
| const ALLOWED_COMMANDS = new Set(env.get(CONFIG_PARAMS.APPLICATIONS_ALLOWEDSPAWNCOMMANDS) ?? []); |
There was a problem hiding this comment.
Very nice solution here. So by default, all built-in modules are allowed, but no commands. Should we allow all commands by default too? Or is that too much of a security risk?
There was a problem hiding this comment.
Yeah, allowing all things except a block list isn't a very good security posture. Part of my intent here is to try lock down things more with this major version upgrade. And then we can easily expand and allow more after v5.
Override child_process module to ensure we only spawn trusted processes and that we only spawn one and only one process for a given name, across threads