Web store: downloading all assets is suspiciously slow #394

@pmonks

When using the Web Store (i.e. with the org.spdx.useJARLicenseInfoOnly JVM property set to false), code that forces Spdx-Java-Library to download all assets for all listed licenses and exceptions is suspiciously slow the first time it runs (i.e. when the local cache is empty). For example, on my laptop with a 1 Gbps residential internet connection, it takes over 3 minutes to complete.

In contrast, on the same laptop and network, the following commands (which approximate what Spdx-Java-Library is doing) demonstrate that the download can be accomplished substantially faster (i.e. in ~3 seconds):

$ time sh -c 'git clone -n --depth=1 --filter=tree:0 https://github.com/spdx/license-list-data; \
              cd license-list-data; \
              git sparse-checkout set --no-cone /text /template; \
              git checkout; \
              rm -rf .git; \
              curl -O https://raw.githubusercontent.com/spdx/license-list-data/refs/heads/main/json/licenses.json; \
              curl -O https://raw.githubusercontent.com/spdx/license-list-data/refs/heads/main/json/exceptions.json' > /dev/null 2>&1

real 0m3.065s
user 0m0.302s
sys  0m0.337s

While it's hard to imagine the JVM equaling the performance of git, the massive difference in runtime (3 minutes vs 3 seconds) raises my suspicions that something is awry in the Web Store code path.
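
For anyone wanting to reproduce the slow measurement, a minimal timing harness might look like the sketch below. It only uses the JDK. The property name comes from this issue; the class name `WebStoreTiming`, the `time` helper, and the empty `downloadEverything` workload are hypothetical placeholders, and the real workload (whatever code forces the library to fetch every listed license and exception) would need to be substituted in. It is also an assumption that the property must be set before the library is first used.

```java
import java.time.Duration;

final class WebStoreTiming {
    // Property name taken from the issue text; "false" selects the Web Store.
    static final String JAR_ONLY_PROPERTY = "org.spdx.useJARLicenseInfoOnly";

    /** Runs the workload once and returns the elapsed wall-clock time. */
    static Duration time(Runnable workload) {
        long start = System.nanoTime();
        workload.run();
        return Duration.ofNanos(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Assumption: set this before the library is first touched.
        System.setProperty(JAR_ONLY_PROPERTY, "false");

        // Hypothetical placeholder: replace with code that forces
        // Spdx-Java-Library to download all listed licenses and exceptions.
        Runnable downloadEverything = () -> { };

        Duration elapsed = time(downloadEverything);
        System.out.printf("Cold run took %d ms%n", elapsed.toMillis());
    }
}
```

The measurement is only meaningful as a cold run, i.e. with the local cache empty, since that is the case described above.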

Metadata

Labels

performance: Speed, responsiveness, and memory efficiency