
Incorrect set comparison between package names and PURLs in NPM sync logic #890

Description

@chinyeungli

This line https://github.com/aboutcode-org/purldb/blob/main/minecode_pipelines/pipes/npm.py#L234
packages_to_sync = list(set(packages).difference(set(synced_packages)))

does not work because "synced_packages" is a list of PURLs (see https://github.com/ghraw/aboutcode-data/minecode-pipelines-config/refs/heads/main/npm/packages_checkpoint.json) while "packages" is a list of package names (see https://github.com/aboutcode-org/purldb/blob/main/minecode_pipelines/miners/npm.py#L129).

In other words, the elements of these two sets never match, so the difference always contains every entry in "packages" and nothing is ever skipped as already synced.
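
To illustrate the mismatch and one possible direction for a fix, here is a minimal sketch. It assumes the checkpoint entries are plain PURL strings such as `pkg:npm/lodash` or `pkg:npm/%40angular/core`, possibly with a version; I have not checked that against the real file contents. The helper `npm_name_from_purl` and the sample data are hypothetical and are not part of purldb. The actual fix may well prefer the `packageurl-python` library over hand parsing.

```python
from urllib.parse import unquote


def npm_name_from_purl(purl: str) -> str:
    """Hypothetical helper: recover the npm package name from a PURL string."""
    prefix = "pkg:npm/"
    if not purl.startswith(prefix):
        raise ValueError(f"not an npm PURL: {purl!r}")
    # Drop the optional subpath (#...) and qualifiers (?...).
    path = purl[len(prefix):].split("#", 1)[0].split("?", 1)[0]
    # Drop the optional version. An "@" at position 0 is an unencoded
    # scope marker (e.g. "@angular/core"), not a version separator.
    at = path.rfind("@")
    if at > 0:
        path = path[:at]
    # Decode the percent-encoded scope marker ("%40" -> "@").
    return unquote(path)


# Sample data (assumed shapes): names from the miner, PURLs from the checkpoint.
packages = ["lodash", "@angular/core", "left-pad"]
synced_packages = ["pkg:npm/lodash", "pkg:npm/%40angular/core"]

# Current behaviour: names never equal PURLs, so nothing is filtered out.
buggy = set(packages).difference(set(synced_packages))

# Sketch of a fix: normalize both sides to package names before comparing.
synced_names = {npm_name_from_purl(p) for p in synced_packages}
packages_to_sync = sorted(set(packages) - synced_names)

print(len(buggy))
print(packages_to_sync)
```

With the sample data, the current comparison keeps all three packages, while the normalized comparison leaves only `left-pad`. Normalizing in the other direction (building PURLs from the names) would work too, as long as both sides end up in the same form before the set difference.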
