[Submission] temporal-workdir — Remote-backed workspace sync for Temporal activities #91
Description
Project link
https://github.com/saeedseyfi/temporal-workdir
Language
Python
Short description
Syncs a local directory with remote storage (GCS, S3, Azure) before and after Temporal activity execution. Enables file-based activities to work across distributed workers where disk is not shared.
Long Description
Temporal activities that read/write files on local disk can't scale to multiple worker instances. Each worker has its own disk with no shared state. The workaround is manually downloading from and uploading to object storage at the start and end of every activity. Every team reimplements this boilerplate.
Other orchestrators solved this years ago. Flyte has FlyteDirectory for automatic directory sync between tasks. Argo has artifact passing with declarative tarballing between steps. Temporal had no equivalent.
temporal-workdir fills that gap. It provides a Workspace that pulls a remote archive to a local temp directory before the activity runs, and pushes changes back after. The activity works with real files on real disk. No FUSE, no custom APIs.
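The pull-then-push lifecycle described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `sketch_workspace` is a hypothetical name, a local zip file stands in for the remote object, and the zip format is chosen only for brevity (the archive format temporal-workdir actually uses is not stated here).

```python
import asyncio
import tempfile
import zipfile
from contextlib import asynccontextmanager
from pathlib import Path


@asynccontextmanager
async def sketch_workspace(archive: Path):
    """Pull an archive into a temp dir, yield it, then push the contents back.

    Hypothetical stand-in: `archive` is a local zip file playing the role of
    the remote object; a real implementation would read and write it through
    a storage client instead.
    """
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        # Pull: unpack the previous state, if any exists yet.
        if archive.exists():
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(workdir)
        yield workdir
        # Push: reached only if the body finished without raising.
        with zipfile.ZipFile(archive, "w") as zf:
            for p in sorted(workdir.rglob("*")):
                if p.is_file():
                    zf.write(p, p.relative_to(workdir).as_posix())


async def main() -> str:
    with tempfile.TemporaryDirectory() as store:
        remote = Path(store) / "job-123.zip"
        # First "worker" writes a file into an initially empty workspace.
        async with sketch_workspace(remote) as path:
            (path / "result.csv").write_text("a,b\n1,2\n")
        # Second "worker" gets a fresh temp dir but sees the same file.
        async with sketch_workspace(remote) as path:
            return (path / "result.csv").read_text()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this sketch, skipping the push when the body raises is a deliberate choice so that a failed run does not overwrite the last good state; whether temporal-workdir behaves the same way is an assumption.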
Context manager (generic):
    async with Workspace("gs://bucket/state/job-123") as ws:
        data = json.loads((ws.path / "config.json").read_text())
        (ws.path / "result.csv").write_text(compute(data))
Temporal decorator:
    @workspace("gs://bucket/{workflow_id}/{activity_type}")
    @activity.defn
    async def process(input: ProcessInput) -> ProcessOutput:
        ws = get_workspace_path()
        ...
Storage backend auto-detected from URL scheme via fsspec (GCS, S3, Azure, local, memory). Published on PyPI as temporal-workdir.
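To make the URL handling concrete, here is a rough sketch of how a templated workspace URL could be expanded and its scheme read. It uses only the standard library rather than fsspec, and `resolve_workspace_url` is a hypothetical helper; how temporal-workdir actually fills the `{workflow_id}` and `{activity_type}` placeholders is an assumption here.

```python
from urllib.parse import urlsplit


def resolve_workspace_url(template: str, **context: str) -> tuple[str, str]:
    """Expand placeholders, then return (scheme, url).

    Hypothetical helper for illustration only. A URL without a scheme is
    treated as a local path and reported as "file".
    """
    url = template.format(**context)
    scheme = urlsplit(url).scheme or "file"
    return scheme, url


# Example values; in a running activity they would come from the activity's
# execution context rather than being hard-coded.
scheme, url = resolve_workspace_url(
    "gs://bucket/{workflow_id}/{activity_type}",
    workflow_id="wf-1",
    activity_type="process",
)
print(scheme, url)
```

The returned scheme is the piece a storage layer such as fsspec would use to pick a backend, so the same activity code can point at `gs://`, `s3://`, or a local path by changing only the URL.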
Author(s)
Saeed Seyfi — https://github.com/saeedseyfi