Anchore Open Source Software (OSS) is a suite of tools for Software Bill of Materials (SBOM) Generation, Vulnerability Scanning, License Scanning, and Vulnerability Database management.
Start by going to the project overview of Anchore OSS to learn more about the basic concepts and functions.
Note: Many topics have nested sub-topics in the navigation pane to the left that become visible when you click a topic.
Installing the Tools
The tools are available in many common distribution channels. The full list of official and community maintained packages can be found on the installation page.
Developers also have Contribution Guides for all of our open source tools and libraries.
1 - Projects
Overview of Anchore Open Source tools.
Anchore Open Source Tools
We maintain three popular command-line tools, some libraries, and supporting utilities. Most are written in Go, with a few in Python. They are all released under the Apache-2.0 license. For the full list, see our GitHub org.
Syft
SBOM Generator and library
Syft (pronounced like sift) is an open-source command-line tool and Go library. Its primary function is to scan container images, file systems, and archives to automatically generate a Software Bill of Materials, making it easier to understand the composition of software.
Grype
Vulnerability Scanner
Grype (pronounced like hype) is an open-source vulnerability scanner specifically designed to analyze container images and filesystems. It works by comparing the software components it finds against a database of known vulnerabilities, providing a report of potential risks so they can be addressed.
Grant
License Scanner
Grant is an open-source command-line tool designed to discover and report on the software licenses present in container images, SBOM documents, or filesystems. It helps users understand the licenses of their software dependencies and can check them against user-defined policies to ensure compliance.
Installing the Tools
The tools are available in many common distribution channels. The full list of official and community maintained packages can be found on the installation page.
Developers also have Contribution Guides for all of our open source tools and libraries.
2 - Data Sources
Vulnerability Data Sources
TODO
3 - Installation
Official and community maintained packages of Anchore OSS Tools
3.1 - Syft
Installing Syft
Official builds
The Anchore OSS team publish official source archives and binary builds of Syft for Linux, macOS and Windows. There are also numerous community-maintained builds of the tools for different platforms.
Installer script
Syft binaries are provided for Linux, macOS and Windows.
curl -sSfL https://get.anchore.io/syft | sudo sh -s -- -b /usr/local/bin
Install script options:
-b: Specify a custom installation directory (defaults to ./bin)
-d: More verbose logging levels (-d for debug, -dd for trace)
-v: Verify the signature of the downloaded artifact before installation (requires cosign to be installed)
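For example, to install into a directory you can write to without sudo and with debug logging enabled, a sketch using the options above (the target directory is only an example):
curl -sSfL https://get.anchore.io/syft | sh -s -- -b ~/.local/bin -d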
Updating Syft
Syft checks for new versions on launch. It will print a message at the end of the output if the version in use is not the latest.
A newer version of syft is available for download: 1.20.0 (installed version is 1.19.2)
Docker container
docker pull anchore/syft
GitHub releases
Download the file for your operating system and architecture from the GitHub releases page
In the case of .deb or .rpm, install them using your package manager
For compressed archives, unpack the file, and copy the syft binary to a folder in your path such as /usr/local/bin
3.2 - Grype
Installing Grype
Official builds
The Anchore OSS team publish official source archives and binary builds of Grype for Linux, macOS and Windows. There are also numerous community-maintained builds of the tools for different platforms.
Installer script
Grype binaries are provided for Linux, macOS and Windows.
curl -sSfL https://get.anchore.io/grype | sudo sh -s -- -b /usr/local/bin
Install script options:
-b: Specify a custom installation directory (defaults to ./bin)
-d: More verbose logging levels (-d for debug, -dd for trace)
-v: Verify the signature of the downloaded artifact before installation (requires cosign to be installed)
Updating Grype
Grype checks for new versions on launch. It will print a message at the end of the output if the version in use is not the latest.
A newer version of grype is available for download: 0.92.0 (installed version is 0.91.2)
Docker container
docker pull anchore/grype
GitHub releases
Download the file for your operating system and architecture from the GitHub releases page
In the case of .deb or .rpm, install them using your package manager
For compressed archives, unpack the file, and copy the grype binary to a folder in your path such as /usr/local/bin
Community builds of Grype
Arch Linux
sudo pacman -S grype-bin
Homebrew
brew tap anchore/grype
brew install grype
MacPorts
sudo port install grype
NuGet
nuget install Anchore.Grype
Snapcraft
snap install grype
3.3 - Grant
Installing Grant
Official builds
The Anchore OSS team publish official source archives and binary builds for Linux and macOS. There are also some community-maintained builds of the tools for different platforms.
Installer script
Grant binaries are provided for Linux and macOS.
curl -sSfL https://get.anchore.io/grant | sudo sh -s -- -b /usr/local/bin
Install script options:
-b: Specify a custom installation directory (defaults to ./bin)
-d: More verbose logging levels (-d for debug, -dd for trace)
-v: Verify the signature of the downloaded artifact before installation (requires cosign to be installed)
GitHub releases
Download the file for your operating system and architecture from the GitHub releases page
In the case of .deb or .rpm, install them using your package manager
For compressed archives, unpack the file, and copy the grant binary to a folder in your path such as /usr/local/bin
Community builds of grant
Homebrew
brew tap anchore/grant
brew install grant
3.4 - Verifying Downloads
Verifying release assets after downloading
Why verify downloads?
Verifying your downloads ensures that:
The files haven’t been tampered with during transit
You’re installing authentic Anchore software
Your supply chain is secure from the start
All release artifacts include checksums, and the checksum file itself is cryptographically signed using cosign for verification.
Note
Installation scripts support automatic verification using the -v flag if you have cosign installed. This performs the same verification steps outlined below.
If you can’t use cosign, you can verify checksums manually. This verifies file integrity but not authenticity.
Security Note
Checksum verification only confirms the file hasn’t been corrupted. It doesn’t verify that the file is authentic. Use cosign verification when possible for better security.
Step 1: Download the files
Download your tool binary and the checksums file:
# Example for Syft v1.23.1
wget https://github.com/anchore/syft/releases/download/v1.23.1/syft_1.23.1_darwin_arm64.tar.gz
wget https://github.com/anchore/syft/releases/download/v1.23.1/syft_1.23.1_checksums.txt
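The next step is to check the downloaded archive against the checksums file. A minimal sketch using sha256sum (the cosign-based verification described above is preferred when available, since checksums alone only confirm integrity):
# verify the downloaded archive against the checksums file; entries for files you didn't download are skipped
sha256sum --ignore-missing -c syft_1.23.1_checksums.txt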
4.1 - SBOM Generation
Learn how to create a Software Bill of Materials (SBOM) for container images, filesystems, and archives using Syft.
An SBOM, or Software Bill of Materials, is a detailed list of all the components, libraries, and modules that make up a piece of software.
For a developer, having an SBOM is crucial for tracking dependencies, quickly identifying known vulnerabilities within those components, and ensuring license compliance.
For a consumer or organization using the software, an SBOM provides transparency into the software’s supply chain, allowing them to assess potential security risks and understand what’s “under the hood.”
Syft is an open-source command-line tool and Go library. Its primary function is to scan container images, file systems, and archives to automatically generate a Software Bill of Materials, making it easier to understand the composition of software.
4.1.1 - Getting Started
Use Syft to generate your first SBOM from container images, directories, or archives.
Syft is a CLI tool for generating a Software Bill of Materials (SBOM) from container images and filesystems.
Installation
Syft is provided as a single compiled executable. Run the command for your platform to download the latest release. The full list of official and community maintained packages can be found on the installation page.
curl -sSfL https://get.anchore.io/syft | sudo sh -s -- -b /usr/local/bin
brew install syft
nuget install Anchore.Syft
See the installation guide for more options including package managers and manual installation.
Display the contents of a public container image
Run syft against a small container image, which will be pulled from DockerHub. The output will be a simple human-readable table.
syft alpine:latest
The output will look similar to the following table.
NAME VERSION TYPE
alpine-baselayout 3.6.8-r1 apk
alpine-baselayout-data 3.6.8-r1 apk
alpine-keys 2.5-r0 apk
alpine-release 3.21.3-r0 apk
apk-tools 2.14.6-r3 apk
busybox 1.37.0-r12 apk
busybox-binsh 1.37.0-r12 apk
...
Learn more
Syft supports more than just containers. Learn more about Supported Sources
Create an industry-standard SBOM
This command will display the human-readable table and write SBOMs in both SPDX and CycloneDX formats, the two primary industry standards.
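A command along these lines produces the table plus both SBOM files; the output file names here are assumptions, chosen to match the jq examples further below:
syft alpine:latest -o table -o spdx-json=alpine.spdx.json -o cyclonedx-json=alpine.cdx.json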
The same table will be displayed, and two SBOM files will be created in the current directory.
Learn more
Syft supports multiple SBOM output formats, find out more about Output Formats.
Examine the SBOM file contents
We can use jq to extract specific package data from the SBOM files (note: by default Syft outputs JSON on a single line,
but you can enable pretty-printing with the SYFT_FORMAT_PRETTY=true environment variable).
Both formats structure package information differently:
SPDX format:
jq '.packages[].name' alpine.spdx.json
CycloneDX format:
jq '.components[].name' alpine.cdx.json
Both commands show the packages that Syft found in the container image:
By default, Syft shows only software visible in the final container image (the “squashed” representation).
To include software from all image layers, regardless of its presence in the final image, use --scope all-layers:
syft <image> --scope all-layers
FAQ
Does Syft need internet access?
Only for downloading container images. By default, scanning works offline.
What about private container registries?
Syft supports authentication for private registries. See Private Registries.
Can I use Syft in CI/CD pipelines?
Absolutely! Syft is designed for automation. Generate SBOMs during builds and scan them for vulnerabilities.
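For example, a minimal pipeline step might generate an SBOM and then gate the build on its vulnerabilities; a sketch, where the image name and the --fail-on threshold are only illustrations:
syft myorg/myimage:latest -o json=sbom.json
grype sbom:sbom.json --fail-on high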
What data does Syft send externally?
Nothing. Syft runs entirely locally and doesn’t send any data to external services.
Next steps
Now that you’ve generated your first SBOM, here’s what you can do next:
Scan for vulnerabilities: Use Grype to find security issues in your SBOMs
Check licenses: Learn about License Scanning to understand dependency licenses
Customize output: Explore different Output Formats for various tools and workflows
Scan different sources: Discover all Supported Sources Syft can analyze
4.1.2 - Supported Sources
Explore the different sources Syft can analyze including container images, OCI registries, directories, files, and archives.
Syft can generate an SBOM from a variety of sources including container images, directories, files, and archives.
In most cases, you can simply point Syft at what you want to analyze and it will automatically detect and catalog it correctly.
Catalog a container image from your local daemon or a remote registry:
syft alpine:latest
Catalog a directory (useful for analyzing source code or installed applications):
syft /path/to/project
Catalog a container image archive:
syft image.tar
To explicitly specify the source, use the --from flag:
--from ARG        Description
docker            Use images from the Docker daemon
podman            Use images from the Podman daemon
containerd        Use images from the containerd daemon
docker-archive    Use a tarball from disk for archives created from docker save
oci-archive       Use a tarball from disk for OCI archives (from Skopeo or otherwise)
file              Read directly from a path on disk (any single file)
registry          Pull the image directly from a registry (bypassing any container runtime)
Source-Specific Behaviors
Container Image Sources
When working with container images, Syft applies the following defaults and behaviors:
Registry: If no registry is specified in the image reference (e.g. alpine:latest instead of docker.io/alpine:latest), Syft assumes docker.io
Platform: For image references that point to a multi-architecture index rather than a single manifest (for example, plain tags), Syft analyzes the linux/amd64 manifest by default.
Use the --platform flag to target a different platform.
When you provide an image reference without specifying a source type (i.e. no --from flag), Syft attempts to resolve the image using the following sources in order:
Docker daemon
Podman daemon
Containerd daemon
Direct registry access
For example, when you run syft alpine:latest, Syft will first check your local Docker daemon for the image.
If Docker isn’t available, it tries Podman, then Containerd, and finally attempts to pull directly from the registry.
You can override this default behavior with the default-image-pull-source configuration option to always prefer a specific source.
See Configuration for more details.
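As a sketch, configuration options can also be set via environment variables; assuming the SYFT_DEFAULT_IMAGE_PULL_SOURCE mapping of this option, forcing direct registry pulls would look like:
SYFT_DEFAULT_IMAGE_PULL_SOURCE=registry syft alpine:latest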
Directory Sources
When you provide a directory path as the source, Syft recursively scans the directory tree to catalog installed software packages and files.
When you point Syft at a directory (especially system directories like /), it automatically skips certain filesystem types to improve
scan performance and avoid indexing areas that don’t contain installed software packages.
Filesystems always skipped
proc / procfs - Virtual filesystem for process information
sysfs - Virtual filesystem for kernel and device information
devfs / devtmpfs / udev - Device filesystems
Filesystems conditionally skipped
tmpfs filesystems are only skipped when mounted at these specific locations:
/dev - Device files
/sys - System information
/run and /var/run - Runtime data and process IDs
/var/lock - Lock files
These paths are excluded because they contain virtual or temporary runtime data rather than installed software packages.
Skipping them significantly improves scan performance and enables you to catalog entire system root directories without getting stuck scanning thousands of irrelevant entries.
Syft identifies these filesystems by reading your system’s mount table (/proc/self/mountinfo on Linux).
When a directory matches one of these criteria, the entire directory tree under that mount point is skipped.
File types excluded
These file types are never indexed during directory scans:
Character devices
Block devices
Sockets
FIFOs (named pipes)
Irregular files
Regular files, directories, and symbolic links are always processed.
Archive Sources
Syft automatically detects and unpacks common archive formats, then catalogs their contents.
If an archive is a container image archive (from docker save or skopeo copy), Syft treats it as a container image.
Supported archive formats:
Standard archives:
.zip
.tar (uncompressed)
.rar (read-only extraction)
Compressed tar variants:
.tar.gz / .tgz
.tar.bz2 / .tbz2
.tar.br / .tbr (brotli)
.tar.lz4 / .tlz4
.tar.sz / .tsz (snappy)
.tar.xz / .txz
.tar.zst / .tzst (zstandard)
Standalone compression formats (extracted if containing tar):
.gz (gzip)
.bz2 (bzip2)
.br (brotli)
.lz4
.sz (snappy)
.xz
.zst / .zstd (zstandard)
OCI Archives and Layout Sources
Syft automatically detects OCI archive and directory structures (including OCI layouts and SIF files) and catalogs them accordingly.
OCI archives and layouts are particularly useful for CI/CD pipelines, as they allow you to catalog images, scan for vulnerabilities, or perform other checks without publishing to a registry. This provides a powerful pattern for build-time gating.
When using container runtime sources (Docker, Podman, or Containerd):
Missing images: If an image doesn’t exist locally in the container runtime, Syft attempts to pull it from the registry via the runtime
Private images: You must be logged in to the registry via the container runtime (e.g., docker login) or have credentials configured for direct registry access. See Authentication with Private Registries for more details.
Environment Variables
Syft respects the following environment variables for each container runtime:
Docker
  DOCKER_HOST          Docker daemon socket/host address (supports ssh:// for remote connections)
  DOCKER_TLS_VERIFY    Enable TLS verification (auto-sets DOCKER_CERT_PATH if not set)
  DOCKER_CERT_PATH     Path to TLS certificates (defaults to ~/.docker if DOCKER_TLS_VERIFY is set)
  DOCKER_CONFIG        Override the default Docker config directory
Podman
  CONTAINER_HOST       Podman socket/host address (e.g., unix:///run/podman/podman.sock or ssh://user@host/path/to/socket)
  CONTAINER_SSHKEY     SSH identity file path for remote Podman connections
Remote Podman instances are configured via the CONTAINER_HOST, CONTAINER_SSHKEY, and CONTAINER_PASSPHRASE environment variables.
Direct Registry Access
The registry source bypasses container runtimes entirely and pulls images directly from the registry.
Credentials are resolved in the following order:
Syft first attempts to use default Docker credentials from ~/.docker/config.json if they exist
If default credentials are not available, you can provide credentials via environment variables. See Authentication with Private Registries for more details.
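A sketch of supplying credentials through environment variables; the SYFT_REGISTRY_AUTH_* variable names follow Syft's configuration-to-environment mapping and should be checked against the private registries documentation, and the registry, user, and image names are placeholders:
SYFT_REGISTRY_AUTH_AUTHORITY=registry.example.com \
SYFT_REGISTRY_AUTH_USERNAME=myuser \
SYFT_REGISTRY_AUTH_PASSWORD='mypassword' \
syft registry.example.com/myimage:latest --from registry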
4.1.3 - Output Formats
Choose from multiple SBOM output formats including SPDX, CycloneDX, and Syft’s native JSON format.
Syft supports multiple output formats to fit different workflows and requirements by using the -o (or --output) flag:
syft <image> -o <format>
Available formats
Syft-native formats
-o ARG    Description
table     A columnar summary (default)
json      Native output for Syft; use this to get as much information out of Syft as possible! (see the JSON schema)
Some output formats support multiple schema versions. Specify a version by appending @<version> to the format name:
syft <source> -o <format>@<version>
Examples:
# Use CycloneDX JSON version 1.4
syft <source> -o cyclonedx-json@1.4

# Use SPDX JSON version 2.2
syft <source> -o spdx-json@2.2

# Default to the latest version if not specified
syft <source> -o cyclonedx-json
Formats with version support:
cyclonedx-json: 1.2, 1.3, 1.4, 1.5, 1.6
cyclonedx-xml: 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6
spdx-json: 2.2, 2.3
spdx-tag-value: 2.1, 2.2, 2.3
When no version is specified, Syft uses the latest supported version of the format.
Learn how to work with Syft’s native JSON format including querying with jq, extracting metadata, and understanding the SBOM structure.
Syft’s native JSON format provides the most comprehensive view of discovered software components, capturing all package metadata, file details, relationships, and source information.
{
"artifacts": [], // Package nodes discovered
"artifactRelationships": [], // Edges between packages and files
"files": [], // File nodes discovered
"source": {}, // What was scanned (the image, directory, etc.)
"distro": {}, // Linux distribution discovered
"descriptor": {}, // Syft version and configuration that captured this SBOM
"schema": {} // Schema version
}
Package (artifacts)
A software package discovered by Syft (library, application, OS package, etc.).
{
"id": "74d9294c42941b37", // Unique identifier for this package that is content addressable
"name": "openssl",
"version": "1.1.1k",
"type": "apk", // Package ecosystem (apk, deb, npm, etc.)
"foundBy": "apk-cataloger",
"locations": [
// Paths used to populate information on this package object
{
"path": "/lib/apk/db/installed", // Always the real-path
"layerID": "sha256:...",
"accessPath": "/lib/apk/db/installed", // How Syft accessed the file (may be a symlink)
"annotations": {
"evidence": "primary"// Qualifies the kind of evidence extracted from this location (primary, supporting)
}
}
],
"licenses": [
{
"value": "Apache-2.0", // Raw value discovered
"spdxExpression": "Apache-2.0", // Normalized SPDX expression of the discovered value
"type": "declared", // "declared", "concluded", or "observed"
"urls": ["https://..."],
"locations": [] // Where license was found
}
],
"language": "c",
"cpes": [
{
"cpe": "cpe:2.3:a:openssl:openssl:1.1.1k:*:*:*:*:*:*:*",
"source": "nvd-dictionary"// Where the CPE was derived from (nvd-dictionary or syft-generated)
}
],
"purl": "pkg:apk/alpine/openssl@1.1.1k",
"metadata": {} // Ecosystem-specific fields (varies by type)
}
File
A file found on disk or referenced in package manager metadata.
{
"id": "def456",
"location": {
"path": "/usr/bin/example",
"layerID": "sha256:..."// For container images
},
"metadata": {
"mode": 493, // File permissions in octal
"type": "RegularFile",
"mimeType": "application/x-executable",
"size": 12345// Size in bytes
},
"digests": [
{
"algorithm": "sha256",
"value": "abc123..." }
],
"licenses": [
{
"value": "Apache-2.0", // Raw value discovered
"spdxExpression": "Apache-2.0", // Normalized SPDX expression of the discovered value
"type": "declared", // "declared", "concluded", or "observed"
"evidence": {
"confidence": 100,
"offset": 1234, // Byte offset in file
"extent": 567// Length of match
}
}
],
"executable": {
"format": "elf", // "elf", "pe", or "macho"
"hasExports": true,
"hasEntrypoint": true,
"importedLibraries": [
// Shared library dependencies
"libc.so.6",
"libssl.so.1.1" ],
"elfSecurityFeatures": {
// ELF binaries only
"symbolTableStripped": false,
"stackCanary": true, // Stack protection
"nx": true, // No-Execute bit
"relRO": "full", // Relocation Read-Only
"pie": true// Position Independent Executable
}
}
}
Relationship
Connects any two nodes (package, file, or source) with a typed relationship.
{
"parent": "package-id", // Package, file, or source ID
"child": "file-id",
"type": "contains"// contains, dependency-of, etc.
}
Source
Information about what was scanned (container image, directory, file, etc.).
The path field always contains the real path after resolving symlinks, while accessPath shows how Syft accessed the file (which may be through a symlink).
The evidence annotation indicates whether this location was used to discover the package (primary) or contains only auxiliary information (supporting).
Descriptor
Syft version and configuration used to generate this SBOM.
jq is a command-line tool for querying and manipulating JSON.
The following examples demonstrate practical queries for working with Syft JSON output.
Find packages by name pattern:
Uses regex pattern matching to find security-critical packages
.artifacts[]
| select(.name | test("^(openssl|ssl|crypto)"))   # Regex pattern match on package name
| {
    name,
    version,
    type    # Package type (apk, deb, rpm, etc.)
  }
Provides a summary count of packages per ecosystem
[.artifacts[]]
| group_by(.type)    # Group packages by ecosystem type
| map({
    type: .[0].type,
    count: length    # Count packages in each group
  })
| sort_by(.count)
| reverse            # Highest count first
Configure which package catalogers Syft uses to discover software components including language-specific and file-based catalogers.
TL;DR
Syft automatically picks the right catalogers for you (recommended for most users)
Scanning a container image? Finds installed packages (like Python packages in site-packages)
Scanning a directory? Finds both installed packages and declared dependencies (like requirements.txt)
Want to customize? Use --select-catalogers to filter, add, or remove catalogers
Need complete control? Use --override-default-catalogers to replace all defaults
Catalogers are Syft’s detection modules that identify software packages in your projects.
Each cataloger specializes in finding specific types of packages—for example, python-package-cataloger finds Python dependencies declared in requirements.txt,
while python-installed-package-cataloger finds Python packages that have already been installed.
Syft includes dozens of catalogers covering languages like Python, Java, Go, JavaScript, Ruby, Rust, and more, as well as OS packages (APK, RPM, DEB) and binary formats.
Default Behavior
Syft uses different cataloger sets depending on what you’re scanning:
Scan Type          Default Catalogers              What They Find                                 Example
Container Image    Image-specific catalogers       Installed packages only                        Python packages in site-packages
Directory          Directory-specific catalogers   Installed packages + declared dependencies     Python packages in site-packages AND requirements.txt
This behavior ensures accurate results across different contexts. When you scan an image, Syft assumes installation steps have completed, so you get results for software that is positively present.
When you scan a directory (like a source code repository), Syft looks for both what's installed and what's declared as a dependency, so you get results not only for what's installed but also for what you intend to install.
Why use different catalogers for different sources?
Most of the time, files that hint at the intent to install software do not have enough information in them to determine the exact version of the package that would be installed.
For example, a requirements.txt file might specify a package without a version, or with a version range.
By looking at installed packages in an image, after any build tooling has been invoked, Syft can provide more accurate version information.
$ syft <source-directory> --select-catalogers python
# Uses:  python-installed-package-cataloger, python-package-cataloger
# Finds: packages in site-packages + requirements.txt, setup.py, Pipfile, etc.
Viewing Active Catalogers
The most reliable way to see which catalogers Syft used is to check the SBOM itself. Every SBOM captures both the catalogers that were requested and those that actually ran:
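A sketch of pulling that information out of a Syft JSON SBOM with jq, mirroring the descriptor fields used in the troubleshooting commands further down:
syft <target> -o json | jq '.descriptor.configuration.catalogers'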
This shows what catalogers were attempted, not just what found packages. The requested field shows your cataloger selection strategy, while used lists every cataloger that ran.
You can also see cataloger activity in real-time using verbose logging, though this is less comprehensive and not as direct.
Exploring Available Catalogers
Use the syft cataloger list command to see all available catalogers and their tags, and to test selection expressions.
List all catalogers
syft cataloger list
Output shows file and package catalogers with their tags.
Use --override-default-catalogers instead of --select-catalogers when:
You need catalogers from both image and directory sets
You want to use catalogers that aren’t in the default set
You need precise control regardless of scan type
Warning
Overriding defaults can lead to incomplete or inaccurate results if you don’t include all necessary catalogers. Use --select-catalogers for most cases.
Examples by Use Case
Filtering to Specific Languages
Scan for only Python packages using defaults for your scan type:
syft <target> --select-catalogers python
Scan for only Java and Go packages:
syft <target> --select-catalogers java,go
Adding Catalogers
Use defaults and also include the SBOM cataloger (which finds embedded SBOMs):
syft <target> --select-catalogers +sbom-cataloger
Scan with defaults plus both SBOM and binary catalogers:
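A sketch of what that might look like; the binary cataloger name shown here, binary-classifier-cataloger, is an assumption, so run syft cataloger list to confirm the exact names:
syft <target> --select-catalogers +sbom-cataloger,+binary-classifier-cataloger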
Scan for Go packages, always include SBOM cataloger, but exclude binary analysis:
$ syft <container-image> --select-catalogers go,+sbom-cataloger,-binary
# Result: go-module-binary-cataloger, sbom-cataloger
# (binary cataloger excluded even though it's in the go tag)
Check which catalogers ran and whether they found packages:
# See which catalogers were used
$ syft <target> -o json | jq '.descriptor.configuration.catalogers.used'

# See which catalogers found packages
$ syft <target> -o json | jq '.artifacts[].foundBy'

# See packages found by a specific cataloger
$ syft <target> -o json | jq '.artifacts[] | select(.foundBy == "python-package-cataloger") | .name'
If your expected cataloger isn’t in the used list:
Verify the cataloger exists for your scan type: Use syft cataloger list --select-catalogers <tag> to preview
Check your selection expressions: You may have excluded it with - or not included it in your filter
Check file locations: Some catalogers look for specific paths (e.g., site-packages for Python)
If the cataloger ran but found nothing, check that:
Package files exist in the scanned source
Files are properly formatted
Files are in the expected locations for that cataloger
How do I know if I’m using image or directory defaults?
Name: The unique identifier for a single cataloger (e.g., python-package-cataloger)
Tag: A label that groups multiple catalogers (e.g., python includes both python-package-cataloger and python-installed-package-cataloger)
Use tags when you want to downselect from the default catalogers, and names when you need to target a specific cataloger.
Why use --select-catalogers vs --override-default-catalogers?
--select-catalogers: Respects Syft’s automatic image/directory behavior, safer for most use cases
--override-default-catalogers: Ignores scan type, gives complete control, requires more knowledge
When in doubt, use --select-catalogers.
Technical Reference
For reference, here’s the formal logic Syft uses for cataloger selection:
image_catalogers = all_catalogers AND catalogers_tagged("image")
directory_catalogers = all_catalogers AND catalogers_tagged("directory")
default_catalogers = image_catalogers OR directory_catalogers
sub_selected_catalogers = default_catalogers INTERSECT catalogers_tagged(TAG) [ UNION sub_selected_catalogers ... ]
base_catalogers = default_catalogers OR sub_selected_catalogers
final_set = (base_catalogers SUBTRACT removed_catalogers) UNION added_catalogers
This logic applies when using --select-catalogers. The --override-default-catalogers flag bypasses the default cataloger selection entirely and starts with the specified catalogers instead.
4.1.6 - File Selection
Control which files and directories Syft includes or excludes when generating SBOMs.
By default, Syft catalogs file details and digests for files owned by discovered packages. You can change this behavior using the SYFT_FILE_METADATA_SELECTION environment variable or the file.metadata.selection configuration option.
Available options:
all: capture all files from the search space
owned-by-package: capture only files owned by packages (default)
none: disable file information capture
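For example, to capture metadata for every file in the search space rather than only package-owned files, using the environment variable named above:
SYFT_FILE_METADATA_SELECTION=all syft alpine:latest -o json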
Excluding file paths
You can exclude specific files and paths from scanning using glob patterns with the --exclude parameter. Use multiple --exclude flags to specify multiple patterns.
# Exclude a specific directory
syft <source> --exclude /etc

# Exclude files by pattern
syft <source> --exclude './out/**/*.json'

# Combine multiple exclusions
syft <source> --exclude './out/**/*.json' --exclude /etc --exclude '**/*.log'
Tip
Always wrap glob patterns in single quotes to prevent your shell from expanding wildcards:
syft <source> --exclude '**/*.json'   # Correct
syft <source> --exclude **/*.json     # May not work as expected
Exclusion behavior by source type
How Syft interprets exclusion patterns depends on whether you’re scanning an image or a directory.
Image scanning
When scanning container images, Syft scans the entire filesystem. Use absolute paths for exclusions:
# Exclude system directories
syft alpine:latest --exclude /etc --exclude /var

# Exclude files by pattern across the entire filesystem
syft alpine:latest --exclude '/usr/**/*.txt'
Directory scanning
When scanning directories, Syft resolves exclusion patterns relative to the specified directory. All exclusion patterns must begin with ./, */, or **/.
# Scanning /usr/foo
syft /usr/foo --exclude ./package.json    # Excludes /usr/foo/package.json
syft /usr/foo --exclude '**/package.json' # Excludes all package.json files under /usr/foo
syft /usr/foo --exclude './out/**'        # Excludes everything under /usr/foo/out
Path prefix requirements for directory scans:
Pattern   Meaning                           Example
./        Relative to scan directory root   ./config.json
*/        One level of directories          */temp
**/       Any depth of directories          **/node_modules
Note
When scanning directories, you cannot use absolute paths like /etc or /usr/**/*.txt. The pattern must begin with ./, */, or **/ to be resolved relative to your specified scan directory.
Create custom SBOM output formats using Go templates with available data fields to build tailored reports for specific tooling or compliance requirements.
Syft lets you define custom output formats using Go templates. This is useful for generating custom reports, integrating with specific tools, or extracting only the data you need.
How to use templates
Set the output format to template and specify the template file path:
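A sketch of what that looks like on the command line, assuming a template file named my-template.tmpl (the file name is only an example; the -t/--template flag supplies the template path):
syft <image> -o template -t my-template.tmpl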
Templates receive the same data structure as the syft-json output format. The Syft JSON schema is the source of truth for all available fields and their structure.
To see what data is available:
# View the full JSON structure
syft <image> -o json

# Explore specific fields
syft <image> -o json | jq '.artifacts[0]'
Key fields commonly used in templates:
.artifacts - Array of discovered packages
.files - Array of discovered files
.source - Information about what was scanned
.distro - Detected Linux distribution (if applicable)
If you have templates from before Syft v0.102.0 that no longer work, set format.template.legacy: true in your configuration. This uses internal Go structs instead of the JSON output schema.
Long-term support for this legacy option is not guaranteed.
Convert existing SBOMs between different formats including SPDX and CycloneDX using Syft’s experimental conversion capabilities.
Experimental Feature
This feature is experimental and may change in future releases.
The ability to convert existing SBOMs means you can create SBOMs in different formats quickly, without the need to regenerate the SBOM from scratch, which may take significantly more time.
We support formats with wide community usage AND good encode/decode support by Syft. The supported formats are:
Syft JSON (-o json)
SPDX JSON (-o spdx-json)
SPDX tag-value (-o spdx-tag-value)
CycloneDX JSON (-o cyclonedx-json)
CycloneDX XML (-o cyclonedx-xml)
Conversion example:
syft alpine:latest -o syft-json=sbom.syft.json                # generate a Syft SBOM
syft convert sbom.syft.json -o cyclonedx-json=sbom.cdx.json   # convert it to CycloneDX
Best practices
Use Syft JSON as the source format
Generate and keep Syft JSON as your primary SBOM. Convert from it to other formats as needed:
# Generate Syft JSON (native format with complete data)
syft <source> -o json=sbom.json

# Convert to other formats
syft convert sbom.json -o spdx-json=sbom.spdx.json
syft convert sbom.json -o cyclonedx-json=sbom.cdx.json
Converting between non-Syft formats loses data. Syft JSON contains all information Syft extracted, while other formats use different schemas that can’t represent the same fields.
Converting between formats may lose data. Packages (names, versions, licenses) transfer reliably, while tool metadata, source details, and format-specific fields may not. Use Syft JSON as the source format to minimize data loss.
Conversions from Syft JSON to SPDX or CycloneDX preserve all standard SBOM fields. Converted output matches directly-generated output (only timestamps and IDs differ).
Avoid chaining conversions (e.g., SPDX → CycloneDX). Each step may lose format-specific data.
Generate cryptographically signed SBOM attestations using in-toto and Sigstore to create, verify, and attach attestations to container images for supply chain security.
Experimental Feature
This feature is experimental and may change in future releases.
Overview
An attestation is cryptographic proof that you created a specific SBOM for a container image. When you publish an image, consumers need to trust that the SBOM accurately describes the image contents. Attestations solve this by letting you sign SBOMs and attach them to images, enabling consumers to verify authenticity.
Syft supports two approaches:
Keyless attestation: Uses your identity (GitHub, Google, Microsoft) as trust root via Sigstore. Best for CI/CD and teams.
Local key attestation: Uses cryptographic key pairs you manage. Best for air-gapped environments or specific security requirements.
Write access to the OCI registry where you’ll publish attestations
Registry authentication configured (e.g., docker login for Docker Hub)
For local key attestations, you’ll also need a key pair. Generate one with:
cosign generate-key-pair
This creates cosign.key (private key) and cosign.pub (public key). Keep the private key secure.
Keyless attestation
Keyless attestation uses Sigstore to tie your OIDC identity (GitHub, Google, or Microsoft account) to the attestation. This eliminates key management overhead.
Create a keyless attestation
syft attest --output cyclonedx-json <IMAGE>
Replace <IMAGE> with your image reference (e.g., docker.io/myorg/myimage:latest). You must have write access to this image.
What happens:
Syft opens your browser to authenticate via OIDC (GitHub, Google, or Microsoft)
After authentication, Syft generates the SBOM
Sigstore signs the SBOM using your identity
The attestation is uploaded to the OCI registry alongside your image
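Consumers can then verify the attestation with cosign. A sketch for keyless verification with cosign v1.x (COSIGN_EXPERIMENTAL=1, per the troubleshooting notes below; the exact flags differ for key-based attestations and for newer cosign versions):
COSIGN_EXPERIMENTAL=1 cosign verify-attestation docker.io/myorg/myimage:latest
Successful verification prints output like the following.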
Verification for docker.io/myorg/myimage:latest --
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
- Any certificates were verified against the Fulcio roots.
This ensures you’re scanning a verified, trusted SBOM.
Troubleshooting
Authentication failures
Ensure you’re logged into the registry: docker login <registry>
Verify you have write access to the image repository
Cosign version errors
Update to cosign ≥ v1.12.0: cosign version
Verification failures
For keyless: ensure COSIGN_EXPERIMENTAL=1 is set
For key-based: verify you’re using the correct public key
Check the attestation type matches (--type spdxjson or --type cyclonedx-json)
Permission denied uploading attestations
Verify write access to the registry
Check authentication credentials are current
Ensure the image exists in the registry before attaching attestations
4.2 - Vulnerability Scanning
Learn how to scan container images, filesystems, and SBOMs for known software vulnerabilities.
Vulnerability scanning is the automated process of proactively identifying security weaknesses and known exploits within software and systems. This is crucial because it helps developers and organizations find and fix potential security holes before malicious actors can discover and exploit them, thus protecting data and maintaining system integrity.
Grype is an open-source vulnerability scanner specifically designed to analyze container images and filesystems. It works by comparing the software components it finds against a database of known vulnerabilities, providing a report of potential risks so they can be addressed.
4.2.1 - Getting Started
Vulnerability Scanning Getting Started
Introduction
Grype is an easy-to-integrate open source vulnerability scanning tool for container images and filesystems.
Install the latest Grype release
Grype is provided as a single compiled executable. Issue the command for your platform to download the latest release of Grype. The full list of official and community maintained packages can be found on the installation page.
curl -sSfL https://get.anchore.io/grype | sudo sh -s -- -b /usr/local/bin
brew install grype
nuget install Anchore.Grype
Once installed, ensure the grype binary is in the PATH for your system.
Scan a container for vulnerabilities
grype <image>
Scan a public container image for vulnerabilities
Run grype with default options against a small container, which will be pulled from DockerHub. Grype will also download the latest vulnerability database. The output will be a simple human-readable table.
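A sketch of that command, using alpine:latest to match the SBOM Generation getting-started guide:
grype alpine:latest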
Grype can scan containers directly, but it can also scan an existing SBOM document.
Note
This presumes you already created alpine_latest-spdx.json using Syft, or some other tool. If not, go to SBOM Generation Getting Started and create it now.
grype alpine_latest-spdx.json
Grype should give similar output to the previous table.
Create a vulnerability report in JSON format
The JSON-formatted output from Grype may be processed or visualized by other tools.
Create the vulnerability report using the --output flag, and pipe the result through jq to make it easier to read.
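A minimal sketch of that; the output file name is only an example:
grype alpine:latest --output json | jq . > alpine_latest-vulns.json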
Grype uses a locally cached database of known vulnerabilities when searching a container, directory, or SBOM for security vulnerabilities. Anchore collates vulnerability data from common feeds, and publishes that data online, at no cost to users.
When Grype is launched, it checks for an existing vulnerability database, and looks for an updated one online. If available, Grype will automatically download the new database.
Users can manage the locally cached database with the grype db command:
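A few of the subcommands, as a non-exhaustive sketch (run grype db --help for the full list):
grype db status    # show information about the locally cached database
grype db check     # check whether a newer database is available
grype db update    # download and install the latest database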
Check and update the database
Manually checking for updates shouldn't be necessary, since Grype does this automatically on launch. However, it is possible to force Grype to look for an updated vulnerability database.
grype db check
A message will indicate if no updates are available since the last download.
Installed DB version v6.0.2 was built on 2025-05-08T04:08:40Z
No update available
If the database is outdated, a message such as this will be displayed.
Installed DB version v6.0.2 was built on 2025-05-07T04:08:13Z
Updated DB version v6.0.2 was built on 2025-05-08T04:08:40Z
You can run 'grype db update' to update to the latest db
[0000] ERROR db upgrade available
grype db update
A short animation will show progress of downloading, uncompressing and hydrating (creating indexes on) the database. Then a message reporting the successful update will be displayed.
grype db update
✔ Vulnerability DB [updated]
Vulnerability database updated to latest version!
Learn about the vulnerability data sources Grype uses for matching
Grype matches vulnerabilities by comparing package information from your software against vulnerability databases. Grype sources these databases from multiple upstream providers, each covering different operating systems and programming language ecosystems. This page documents each data source, what it covers, and how Grype interprets the data.
The National Vulnerability Database (NVD) provides Common Vulnerabilities and Exposures (CVE) data that supplements ecosystem-specific sources. Grype uses the NVD CVE API 2.0 (Vunnel provider: nvd) to access vulnerability information across all ecosystems using Common Platform Enumeration (CPE) matching.
GitHub Security Advisories provides vulnerability data for multiple language ecosystems:
Composer (PHP) → composer
Dart → dart
Go → go
Java (Maven) → java
npm (JavaScript) → npm
NuGet (.NET) → nuget
Python (PyPI) → python
Ruby (RubyGems) → gem
Rust (crates.io) → rust
Swift → swift
GitHub Actions → github-action
How it works:
Grype retrieves vulnerability data from GitHub’s GraphQL Application Programming Interface (API). Each advisory includes a GitHub Security Advisories ID (GHSA-xxx) and may include associated CVE identifiers. The data includes both general vulnerabilities and malware classifications.
The provider downloads advisories in batches of 100 per GraphQL request and handles GitHub’s rate limiting by pausing when fewer than 10 API requests remain. For incremental updates, the provider uses an updatedSince timestamp parameter to fetch only advisories modified since the last update.
Assumptions and interpretation:
Severity mapping: GitHub provides four severity levels that map directly to Grype’s normalized scale:
LOW → Low
MODERATE → Medium
HIGH → High
CRITICAL → Critical
Version matching: GitHub provides version ranges in ecosystem-specific formats. For example, npm packages use semantic versioning (semver) syntax, while Python packages use PEP 440 version specifiers. Grype interprets these ranges according to each ecosystem’s version comparison rules.
CVSS scores: When available, Grype extracts and validates CVSS vector strings from the advisory data to provide detailed vulnerability scoring information.
Authentication: GitHub requires a personal access token for API access. Without proper authentication, data retrieval fails.
The Bitnami Vulnerability Database contains vulnerability information for applications packaged by Bitnami. The data covers various language ecosystems and is stored in Open Source Vulnerability (OSV) format version 1.5.0.
How it works:
Grype clones the Bitnami VulnDB Git repository from the main branch and processes the OSV-formatted vulnerability records.
Assumptions and interpretation:
Data format: All vulnerability records follow the OSV schema, which provides a standardized structure for vulnerability information across different ecosystems.
Scope: The database focuses on vulnerabilities affecting Bitnami-packaged applications, which may include both upstream vulnerabilities and Bitnami-specific issues.
For Ubuntu, development and end-of-life releases are also supported in addition to the standard releases.
How it works:
Grype clones the Ubuntu CVE Tracker Git repository and parses the tracking files that document vulnerability status for each Ubuntu release. The tracker includes patch states that indicate whether a package is vulnerable, fixed, or not affected.
For end-of-life Ubuntu releases, Grype examines the repository’s revision history to determine the final patch states before support ended.
Assumptions and interpretation:
Severity mapping: Ubuntu uses a six-level severity scale that maps to Grype’s normalized levels:
Untriaged → Unknown
Negligible → Negligible
Low → Low
Medium → Medium
High → High
Critical → Critical
Patch states: Ubuntu tracks vulnerabilities with several patch states:
DNE (Does Not Exist) → Package not affected because it doesn’t exist in this release
needs-triage → Vulnerability confirmed but not yet assessed
needed → Vulnerable, no fix available yet
released → Vulnerable, fix available at specified version
pending → Fix prepared but not yet released
active → Vulnerability being actively worked on
ignored → Vulnerability acknowledged but deliberately not fixed (not considered vulnerable for matching purposes)
Version format: Ubuntu uses dpkg version comparison rules for determining whether a package version is affected.
End-of-life handling: For releases that have reached end-of-life, Grype merges patch states from the repository’s revision history to capture the final vulnerability status.
Fix availability: When a patch state indicates released, Grype extracts the fix version from the tracking data. A fix version of “None” means the package is vulnerable with no fix available.
Grype retrieves vulnerability data from two Debian sources: a JSON feed from the Debian Security Tracker and Debian Security Advisory (DSA) lists. The provider combines information from both sources to build a complete picture of vulnerabilities affecting Debian packages.
Assumptions and interpretation:
Severity mapping: Debian uses an urgency-based severity system with some special notations:
unimportant → Negligible
low, low** → Low
medium, medium** → Medium
high, high** → High
When Debian doesn’t provide severity information, Grype falls back to NVD severity data if available
Version format: Debian uses dpkg version comparison rules, the same as Ubuntu.
Special version handling: A fix version of “0” indicates the package is not vulnerable in that particular Debian release.
Advisory metadata: When a DSA (Debian Security Advisory) exists for a vulnerability, Grype includes the DSA identifier and provides a link to the advisory.
Legacy data support: The provider can also process data from Debian’s previous feed service format to maintain historical vulnerability records.
Alpine Linux 3.2 and newer, plus the edge (development) branch.
How it works:
Grype downloads YAML files from Alpine’s Security Database (SecDB) for each supported Alpine release. Each release has separate databases for the main and community package repositories. The provider parses the “secfixes” sections that map package versions to the CVE identifiers they fix.
Assumptions and interpretation:
Severity: Alpine’s SecDB does not include severity ratings in the source data. All Alpine vulnerabilities show as “Unknown” severity unless supplemented by data from other sources like NVD.
Version format: Alpine uses apk package version comparison rules.
Database types: Alpine maintains two package databases:
main → Core Alpine packages
community → Community-maintained packages
Note: Alpine 3.2 does not have a community database (community repository support was added in 3.3).
Fix mapping: The secfixes section lists package versions and the CVE IDs they address. When a package version includes a fix for a CVE, Grype considers that version and all later versions non-vulnerable.
Red Hat Enterprise Linux 5, 6, 7, 8, 9 (RHEL 3 and 4 are skipped by default)
How it works:
Grype retrieves vulnerability data from Red Hat’s Common Vulnerabilities and Exposures (CVE) summary Application Programming Interface (API) and supplements it with detailed information from either Common Security Advisory Framework (CSAF) or Open Vulnerability and Assessment Language (OVAL) sources. You can configure which advisory format to use.
The provider performs a minimal initial download of CVE summaries, then fetches full CVE details only for relevant vulnerabilities. To avoid excessive API calls, the provider performs full synchronization at a configurable interval (default: 2 days) and uses incremental updates between full syncs.
Assumptions and interpretation:
RHSA source options: Red Hat Security Advisories (RHSA) are available in two formats:
CSAF (Common Security Advisory Framework) → Structured JSON format
OVAL (Open Vulnerability and Assessment Language) → XML format for automated assessment
Version format: Red Hat uses RPM version comparison rules.
Extended Update Support (EUS): The provider handles EUS versions, which receive extended security updates beyond the normal RHEL lifecycle.
Parallel processing: By default, Grype processes Red Hat data using 4 parallel workers to improve performance during large synchronizations.
Grype retrieves Amazon Linux Security Advisories (ALAS) from RSS feeds maintained for each Amazon Linux version. The provider parses the RSS feed to get advisory summaries, then scrapes the HTML pages for detailed package and vulnerability information.
Due to occasional HTTP 403 errors when accessing advisory pages, the provider tolerates up to 25 such errors by default before failing.
Assumptions and interpretation:
Severity mapping: Amazon Linux uses four severity levels:
low → Low
medium → Medium
important → High
critical → Critical
Version format: Amazon Linux uses RPM version comparison rules.
RSS feeds: Each Amazon Linux version has its own RSS feed URL.
Grype downloads a compressed OVAL XML file that contains all Enterprise Linux Security Advisories (ELSA) for Oracle Linux. The provider parses this XML file to extract vulnerability and package information.
Assumptions and interpretation:
Severity mapping: Oracle Linux uses a five-level severity scale:
n/a → Negligible
low → Low
moderate → Medium
important → High
critical → Critical
Version format: Oracle Linux uses RPM version comparison rules.
Ksplice filtering: The provider filters out packages related to Ksplice (Oracle’s kernel live-patching technology) because these packages are not fully supported for vulnerability matching.
OVAL format: The data comes from a single compressed XML file: https://linux.oracle.com/security/oval/com.oracle.elsa-all.xml.bz2
SUSE Linux Enterprise Server 11, 12, 15 (configurable, defaults to these three versions)
How it works:
Grype downloads OVAL XML files from SUSE’s FTP server, with one file per major SLES version. Each file contains vulnerability definitions and affected package information.
Assumptions and interpretation:
Severity mapping: SUSE uses multiple terms that map to Grype’s normalized levels:
low → Low
moderate → Medium
medium → Medium
high → High
important → High
critical → Critical
Version format: SUSE uses RPM version comparison rules.
URL template: OVAL files follow this pattern: https://ftp.suse.com/pub/projects/security/oval/suse.linux.enterprise.server.{version}.xml.bz2
Rocky Linux is a community enterprise operating system designed to be downstream compatible with Red Hat Enterprise Linux.
How it works:
Grype fetches vulnerability data for Rocky Linux from the Rocky Linux Apollo API, which provides records in Open Source Vulnerability (OSV) format.
Assumptions and interpretation:
Data format: All vulnerability records follow the OSV schema.
Ecosystem normalization: The provider normalizes ecosystem identifiers from the OSV format. For example, “Rocky Linux:8” becomes “rocky:8” for consistency with Grype’s internal ecosystem naming.
Version format: Rocky Linux uses RPM version comparison rules.
CBL-Mariner
Data source: Microsoft CBL-Mariner OVAL
Vunnel provider: mariner
Supported versions:
CBL-Mariner 1.0, 2.0, 3.0
What it covers:
CBL-Mariner (Common Base Linux) is Microsoft’s internal Linux distribution, also available as an open source project.
How it works:
Grype downloads OVAL XML files for CBL-Mariner and parses them using the xsdata library. The provider processes rpminfo_test, rpminfo_object, and rpminfo_state elements to extract vulnerability and package information.
Assumptions and interpretation:
Version format: CBL-Mariner uses RPM version comparison rules.
OVAL processing: The provider handles standard OVAL XML structures to identify which package versions are affected by vulnerabilities.
The National Vulnerability Database provides comprehensive Common Vulnerabilities and Exposures (CVE) data across all ecosystems. Unlike ecosystem-specific providers, NVD uses Common Platform Enumeration (CPE) matching to identify vulnerable software.
How it works:
Grype retrieves CVE data from the NVD API 2.0, which provides up to 2000 results per request. For initial synchronization, the provider downloads all CVEs. For subsequent updates, it uses the last modified timestamp to fetch only CVEs that changed since the previous update.
The provider caches input data in a SQLite database to improve performance across runs. It supports retry logic (10 retries by default) to handle transient API failures.
Assumptions and interpretation:
CPE matching: NVD identifies vulnerable software using CPE (Common Platform Enumeration) identifiers. A CPE describes a software product with vendor, product name, version, and other attributes. Grype matches packages against CPE patterns to determine vulnerability status.
Incremental updates: The provider uses lastModStartDate and lastModEndDate parameters to fetch only CVEs modified within a specific time range, reducing API calls and bandwidth.
API rate limits: NVD enforces rate limits on API requests. You can provide an API key to enable higher rate limits. Without an API key, you’re limited to the public rate limit.
Fix date enrichment: NVD data often lacks information about when fixes became available. Grype supplements NVD data with fix dates from external databases when available, improving the accuracy of vulnerability timelines.
CPE configuration overrides: The provider supports custom CPE configurations that can override or supplement the default CPE matching data from NVD.
Publication date ranges: When querying by publication date, the API enforces a maximum 120-day range per request. The provider automatically splits larger date ranges into multiple requests.
Relationship to other providers:
NVD serves as a cross-cutting data source that complements ecosystem-specific providers. When an ecosystem-specific provider lacks severity information (such as Alpine), Grype can fall back to NVD severity ratings. NVD is also essential for CVE-only lookups where you need to check for a specific CVE across all ecosystems.
Because NVD uses CPE matching rather than package manager metadata, it can identify vulnerabilities in software that doesn’t come from a package manager. However, ecosystem-specific sources typically provide more accurate and granular information for their respective ecosystems by using native package version information.
Common patterns across providers
Severity normalization
All vulnerability providers map their severity ratings to a common scale that Grype uses for reporting:
Unknown → Severity information not available
Negligible → Minimal or no practical impact
Low → Limited impact, typically requiring complex exploit conditions
Medium → Moderate impact, may require specific conditions
High → Serious impact, relatively easy to exploit
Critical → Severe impact, easily exploitable, or widespread effect
Different providers use different severity scales in their source data. For example, both Amazon Linux and Oracle Linux use "important" for vulnerabilities that Grype maps to High. Grype normalizes these provider-specific terms to ensure consistent severity reporting across all data sources.
When a provider doesn’t include severity information in their data, Grype may fall back to NVD severity ratings if available.
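A minimal sketch of this normalization, assuming a simple string-to-level mapping; the entries below are illustrative, not the exhaustive mapping Grype uses:

package main

import (
	"fmt"
	"strings"
)

type Severity int

const (
	Unknown Severity = iota
	Negligible
	Low
	Medium
	High
	Critical
)

// normalizeSeverity maps a provider-specific severity string onto the
// common scale. The entries below are illustrative, not exhaustive.
func normalizeSeverity(raw string) Severity {
	switch strings.ToLower(raw) {
	case "negligible":
		return Negligible
	case "low", "minor":
		return Low
	case "medium", "moderate":
		return Medium
	case "high", "important": // e.g. Amazon Linux and Oracle Linux "important"
		return High
	case "critical":
		return Critical
	default:
		return Unknown
	}
}

func main() {
	fmt.Println(normalizeSeverity("important") == High) // true
}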
Version matching
Version matching rules depend on the package format:
DEB-based systems (Ubuntu, Debian):
These systems use dpkg version comparison rules, which handle Debian-specific version components like epochs and revisions. For example, 1:2.0-1 has an epoch of 1, making it newer than 2.0-1 despite appearing lower numerically.
RPM-based systems (RHEL, Amazon, Oracle, SLES, Mariner, Alma, Rocky):
These systems use RPM version comparison rules, which compare version strings segment by segment. RPM versions can include release numbers and distribution tags. For example, 1.2.3-4.el8 includes version 1.2.3, release 4, and distribution tag el8.
APK-based systems (Alpine, Wolfi, Chainguard):
These systems use Alpine package version rules, which follow a simpler numeric comparison scheme with support for suffix modifiers like -r1 for package revisions.
Language packages (GitHub):
Language ecosystems use their own version comparison rules:
npm uses semantic versioning (semver) with ranges like >=1.2.3 <2.0.0
Python uses PEP 440 version specifiers with ranges like >=1.2,<2.0
Ruby uses RubyGems version comparison
Maven uses Maven version ordering rules
Each ecosystem has its own syntax for expressing version ranges, and Grype interprets these ranges according to the ecosystem’s version comparison semantics.
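To make the epoch example above concrete, here is a heavily simplified, epoch-aware comparison for Debian-style versions. Real dpkg comparison also handles revisions and character-level ordering rules, so treat this purely as illustration:

package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitEpoch separates an optional "epoch:" prefix from a Debian-style
// version string. A missing epoch counts as 0.
func splitEpoch(v string) (int, string) {
	if i := strings.Index(v, ":"); i >= 0 {
		if e, err := strconv.Atoi(v[:i]); err == nil {
			return e, v[i+1:]
		}
	}
	return 0, v
}

// lessThan reports whether a sorts before b, using only the epoch and a
// plain string comparison of the remainder (a simplification of dpkg rules).
func lessThan(a, b string) bool {
	ea, ra := splitEpoch(a)
	eb, rb := splitEpoch(b)
	if ea != eb {
		return ea < eb
	}
	return ra < rb
}

func main() {
	fmt.Println(lessThan("2.0-1", "1:2.0-1")) // true: epoch 1 outranks the implicit epoch 0
}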
Fix date enrichment
Many providers supplement vulnerability records with “fix available” dates, which indicate when a vulnerability fix first became available. This information establishes accurate vulnerability timelines.
Grype uses external databases (called “fixdaters”) to determine fix availability dates. These databases track when security advisories were published or when fixed package versions were released. The fix date information includes:
Date: When the fix became available
Kind: The type of evidence (such as “advisory” for security advisory publication dates or “snapshot” for package repository snapshots)
Fix dates improve matching accuracy by allowing Grype to determine whether a vulnerability existed in a package at a specific point in time.
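As a sketch of how fix dates feed into point-in-time reasoning, the record below carries the Date and Kind fields described above, and a small helper decides whether a fix was already available at a given time. The type and function names are illustrative assumptions:

package main

import (
	"fmt"
	"time"
)

// FixDate captures when and how we learned a fix became available.
// The field names mirror the description above; they are illustrative.
type FixDate struct {
	Date time.Time // when the fix became available
	Kind string    // evidence type, e.g. "advisory" or "snapshot"
}

// fixedBy reports whether any recorded fix date falls on or before the
// point in time being evaluated.
func fixedBy(at time.Time, fixes []FixDate) bool {
	for _, f := range fixes {
		if !f.Date.After(at) {
			return true
		}
	}
	return false
}

func main() {
	fixes := []FixDate{{Date: time.Date(2023, 6, 1, 0, 0, 0, 0, time.UTC), Kind: "advisory"}}
	fmt.Println(fixedBy(time.Date(2023, 7, 1, 0, 0, 0, 0, time.UTC), fixes)) // true
}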
Data freshness
Different providers use different strategies for keeping vulnerability data current:
Incremental updates:
Some providers support incremental updates that fetch only changed data since the last run:
GitHub Security Advisories: Uses an updatedSince timestamp parameter to fetch only advisories modified after a specific date
NVD: Uses lastModStartDate and lastModEndDate parameters to fetch only CVEs modified within a date range
Red Hat: Downloads minimal CVE summaries, then selectively fetches full CVE details for relevant vulnerabilities; performs full synchronization every 2 days by default
Full refresh:
Other providers re-download and re-process all data each run, though they may use caching to improve performance:
Git-based providers (Ubuntu, Bitnami, Alma): Clone the entire Git repository each run
Feed-based providers (Debian, Alpine, Oracle, SLES): Download complete feeds, which may be cached locally
RSS-based providers (Amazon): Parse RSS feeds and fetch advisory pages
The update strategy affects how quickly new vulnerability data appears in Grype. Providers with incremental updates can fetch recent changes more efficiently, while full refresh providers ensure complete data consistency at the cost of higher bandwidth usage.
Learn how to scan container images and filesystems for software licenses, covering detection, compliance checking, and managing license obligations.
License scanning involves automatically identifying and analyzing the licenses associated with the various software components used in a project.
This is important because most software relies on third-party and open-source components, each with its own licensing terms that dictate how the software can be used, modified, and distributed, and failing to comply can lead to legal issues.
Grant is an open-source command-line tool designed to discover and report on the software licenses present in container images, SBOM documents, or filesystems. It helps users understand the licenses of their software dependencies and can check them against user-defined policies to ensure compliance.
4.3.1 - Getting Started
License Scanning Getting Started
Introduction
Grant searches SBOMs for licenses and the packages they belong to.
Install the latest Grant release
Grant is provided as a single compiled executable. Issue the command for your platform to download the latest release of Grant. The full list of official and community maintained packages can be found on the installation page.
Configure authentication for scanning container images from private registries using credentials, registry tokens, and credential helpers.
The Anchore OSS tools analyze container images from private registries using multiple authentication methods.
When a container runtime isn’t available, the tools use the go-containerregistry library to handle authentication directly with registries.
When using a container runtime explicitly (for instance, with the --from docker flag) the tools defer to the runtime’s authentication mechanisms.
However, if the registry source is used, the tools use the Docker configuration file and any configured credential helpers to authenticate with the registry.
Registry tokens and personal access tokens
Many registries support personal access tokens (PATs) or registry tokens for authentication. Use docker login with your token, then the tools can use the cached credentials:
# GitHub Container Registry - create token at https://github.com/settings/tokens (needs read:packages scope)
docker login ghcr.io -u <username> -p <token>
syft ghcr.io/username/private-image:latest

# GitLab Container Registry - use deploy token or personal access token
docker login registry.gitlab.com -u <username> -p <token>
syft registry.gitlab.com/group/project/image:latest
The tools read credentials from ~/.docker/config.json, the same file Docker uses when you run docker login. This file can contain either basic authentication credentials or credential helper configurations.
An example of a manually crafted config using credential helpers is shown in the next section. When running the tools in a container, mount the config file and point DOCKER_CONFIG at its directory:
docker run -v ./config.json:/auth/config.json -e "DOCKER_CONFIG=/auth" anchore/syft:latest <private_image>
Docker credential helpers
Docker credential helpers are specialized programs that securely store and retrieve registry credentials. They’re particularly useful for cloud provider registries that use dynamic, short-lived tokens.
Instead of storing passwords as plaintext in config.json, you configure helpers that generate credentials on-demand. This is facilitated by the google/go-containerregistry library.
Configuring credential helpers
Add credential helpers to your config.json:
{
"credHelpers": {
// using the docker-credential-gcr for Google Container Registry and Artifact Registry
"gcr.io": "gcr",
"us-docker.pkg.dev": "gcloud",
// using the amazon-ecr-credential-helper for AWS Elastic Container Registry
"123456789012.dkr.ecr.us-west-2.amazonaws.com": "ecr-login",
// using the docker-credential-acr for Azure Container Registry
"myregistry.azurecr.io": "acr" }
}
When the tools access these registries, they execute the corresponding helper program (for example, docker-credential-gcr, or more generically docker-credential-NAME where NAME is the config value) to obtain credentials.
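For reference, credential helpers follow a simple exec protocol: the caller runs docker-credential-<name> get, writes the registry host to stdin, and reads a JSON object containing Username and Secret from stdout. The sketch below exercises that protocol directly; it is an illustration, not code from the tools:

package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
	"strings"
)

// helperCredentials invokes `docker-credential-<helper> get`, passing the
// registry host on stdin and decoding the JSON response, mirroring the
// credential helper protocol described above.
func helperCredentials(helper, registry string) (user, secret string, err error) {
	cmd := exec.Command("docker-credential-"+helper, "get")
	cmd.Stdin = strings.NewReader(registry)
	out, err := cmd.Output()
	if err != nil {
		return "", "", err
	}
	var resp struct {
		Username string
		Secret   string
	}
	if err := json.Unmarshal(out, &resp); err != nil {
		return "", "", err
	}
	return resp.Username, resp.Secret, nil
}

func main() {
	user, _, err := helperCredentials("gcr", "gcr.io")
	if err != nil {
		fmt.Println("helper lookup failed:", err)
		return
	}
	fmt.Println("username:", user)
}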
Note
If both credHelpers and auths are configured for the same registry, credHelpers takes precedence.
For more information about Docker credential helpers for various cloud providers:
When running the tools in Kubernetes and you need access to private registries, mount Docker credentials as a secret.
Create secret
Create a Kubernetes secret containing your Docker credentials. The key config.json is important—it becomes the filename when mounted into the pod.
For more information about the credential file format, see the go-containerregistry config docs.
# Base64 encode your config.json
cat ~/.docker/config.json | base64

# Apply the secret
kubectl apply -f secret.yaml
Configure pod
Configure your pod to use the credential secret. The DOCKER_CONFIG environment variable tells the tools where to look for credentials.
Setting DOCKER_CONFIG=/config means the tools look for credentials at /config/config.json.
This matches the secret key config.json we created above—when Kubernetes mounts secrets, each key becomes a file with that name.
The volumeMounts section mounts the secret to /config, and the volumes section references the secret created in the previous step.
Guidelines for developing & contributing to Anchore Open Source projects
Anchore OSS Contribution Guidelines
Each tool has its own slightly different guide, linked below. However, some of the guidelines are common across all tools, and are shown in the next section, General Guidelines.
This document is the single source of truth for how to contribute to the code base. We’d love to accept your patches and contributions to this project. There are just a few small guidelines you need to follow.
Sign off your work
The sign-off is an added line at the end of the explanation for the commit, certifying that you wrote it or otherwise have the right to submit it as an open-source patch. By submitting a contribution, you agree to be bound by the terms of the DCO Version 1.1 and Apache License Version 2.0.
Signing off a commit certifies the below Developer’s Certificate of Origin (DCO):
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
When committing your change, you can add the required line manually so that it looks like this:
Signed-off-by: John Doe <john.doe@example.com>
Creating a signed-off commit is then possible with -s or --signoff:
git commit -s -m "this is a commit message"
To double-check that the commit was signed-off, look at the log output:
$ git log -1
commit 37ceh170e4hb283bb73d958f2036ee5k07e7fde7 (HEAD -> issue-35, origin/main, main)
Author: John Doe <john.doe@example.com>
Date: Mon Aug 1 11:27:13 2020 -0400
this is a commit message
Signed-off-by: John Doe <john.doe@example.com>
Test your changes
This project has a Makefile which includes many helpers for running both unit and integration tests. You can run make help to see all the options. Although PRs will have automatic checks for these, it is useful to run them locally, ensuring they pass before submitting changes. Ensure you've bootstrapped once before running tests:
make bootstrap
You only need to bootstrap once. After the bootstrap process, you can run the tests as many times as needed:
make unit
make integration
You can also run make all to run a more extensive test suite, but there is additional configuration that will be needed for those tests to run correctly. We will not cover the extra steps here.
Pull Request
If you made it this far and all the tests are passing, it's time to submit a Pull Request (PR) for the project. Submitting a PR is always a scary moment as what happens next can be an unknown. The projects strive to be easy to work with, and we appreciate all contributions. Nobody is going to yell at you or try to make you feel bad. We love contributions and know how scary that first PR can be.
PR Title and Description
Just like the commit title and description mentioned above, the PR title and description are very important for letting others know what's happening. Please include any details you think a reviewer will need to properly review your PR.
A PR that is very large or poorly described has a higher likelihood of being pushed to the end of the list. Reviewers like PRs they can understand and quickly review.
What to expect next
Please be patient with the project. We try to review PRs in a timely manner, but this is highly dependent on all the other tasks we have going on. It's OK to ask for a status update every week or two; it's not OK to ask for a status update every day.
It’s very likely the reviewer will have questions and suggestions for changes to your PR. If your changes don’t match the current style and flow of the other code, expect a request to change what you’ve done.
Document your changes
Lastly, when proposed changes modify user-facing functionality or output, the PR is expected to include updates to the documentation as well. Our projects are not heavy on documentation; this will mostly mean updating the README and the help output for the tool.
If nobody knows new features exist, they can’t use them!
5.1 - Syft
Developer guidelines when contributing to Syft
We welcome contributions to the project! There are a few useful things to know before diving into the codebase.
Do also take note of the General Guidelines that apply across all Anchore Open Source projects.
Getting started
In order to test and develop in the Syft repo you will need the following dependencies installed:
Golang
docker
make
Python (>= 3.9)
Docker settings for getting started
Make sure you’ve updated your docker settings so the default docker socket path is available.
Go to:
docker -> settings -> advanced
Make sure:
Allow the default Docker socket to be used
is checked.
Also double check that the docker context being used is the default context. If it is not, run:
docker context use default
After cloning, the following steps can help you get set up:
run make bootstrap to download go mod dependencies, create the /.tmp dir, and download helper utilities.
run make to view the selection of developer commands in the Makefile
run make build to build the release snapshot binaries and packages
for an even quicker start you can run go run cmd/syft/main.go to print the syft help.
running go run cmd/syft/main.go alpine:latest will compile and run syft against alpine:latest
view the README or syft help output for more output options
The main make tasks for common static analysis and testing are lint, format, lint-fix, unit, integration, and cli.
See make help for all the current make tasks.
Internal Artifactory Settings
Not always applicable
Some companies have Artifactory setup internally as a solution for sourcing secure dependencies.
If you’re seeing an issue where the unit tests won’t run because of the below error then this section might be relevant for your use case.
[ERROR] [ERROR] Some problems were encountered while processing the POMs
If you’re dealing with an issue where the unit tests will not pull/build certain java fixtures check some of these settings:
a settings.xml file should be available to help you communicate with your internal artifactory deployment
this can be moved to syft/pkg/cataloger/java/test-fixtures/java-builds/example-jenkins-plugin/ to help build the unit test-fixtures
you’ll also want to modify the build-example-jenkins-plugin.sh to use settings.xml
For more information on this setup and troubleshooting see issue 1895
Architecture
At a high level, this is the package structure of syft:
./cmd/syft/
│ ├── cli/
│ │ ├── cli.go // where all commands are wired up
│ │ ├── commands/ // all command implementations
│ │ ├── options/ // all command flags and configuration options
│ │ └── ui/ // all handlers for events that are shown on the UI
│ └── main.go // entrypoint for the application
└── syft/ // the "core" syft library
├── format/ // contains code to encode or decode to and from SBOM formats
├── pkg/ // contains code to catalog packages from a source
├── sbom/ // contains the definition of an SBOM
└── source/ // contains code to create a source object for some input type (e.g. container image, directory, etc)
Syft’s core library is implemented in the syft package and subpackages, where the major packages are:
the syft/source package produces a source.Source object that can be used to catalog a directory, container, and other source types.
the syft package contains a single function that can take a source.Source object and catalog it, producing an sbom.SBOM object
the syft/format package contains the ability to encode and decode SBOMs to and from different SBOM formats (such as SPDX and CycloneDX)
At the highest level, the cmd package wires up spf13/cobra commands for execution in the main application:
sequenceDiagram
participant main as cmd/syft/main
participant cli as cli.New()
participant root as root.Execute()
participant cmd as <command>.Execute()
main->>+cli:
Note right of cli: wire ALL CLI commands
Note right of cli: add flags for ALL commands
cli-->>-main: root command
main->>+root:
root->>+cmd:
cmd-->>-root: (error)
root-->>-main: (error)
Note right of cmd: Execute SINGLE command from USER
The packages command uses the core library to generate an SBOM for the given user input:
sequenceDiagram
participant source as source.New(ubuntu:latest)
participant sbom as sbom.SBOM
participant catalog as syft.CatalogPackages(src)
participant encoder as syft.Encode(sbom, format)
Note right of source: use "ubuntu:latest" as SBOM input
source-->>+sbom: add source to SBOM struct
source-->>+catalog: pass src to generate catalog
catalog-->-sbom: add cataloging results onto SBOM
sbom-->>encoder: pass SBOM and format desired to syft encoder
encoder-->>source: return bytes that are the SBOM of the original input
Note right of catalog: cataloger configuration is done based on src
The pkg.Package object is a core data structure that represents a software package. Fields like name and version probably don’t need
a detailed explanation, but some of the other fields are worth a quick overview:
FoundBy: the name of the cataloger that discovered this package (e.g. python-pip-cataloger).
Locations: the set of paths and layer IDs that were parsed to discover this package.
Language: the language of the package (e.g. python).
Type: this is a high-level categorization of the ecosystem the package resides in. For instance, even if the package is a egg, wheel, or requirements.txt reference, it is still logically a “python” package. Not all package types align with a language (e.g. rpm) but it is common.
Metadata: specialized data for specific location(s) parsed. We should try and raise up as much raw information that seems useful. As a rule of thumb the object here should be as flat as possible and use the raw names and values from the underlying source material parsed.
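To tie these fields together, here is a purely illustrative value using the field names described above. The types are stand-ins, not the actual syft structs:

package main

import "fmt"

// Package is a stand-in with the fields discussed above; the real
// pkg.Package in syft has richer types for locations and metadata.
type Package struct {
	Name      string
	Version   string
	FoundBy   string
	Locations []string
	Language  string
	Type      string
	Metadata  any // specialized, parser-specific data
}

func main() {
	p := Package{
		Name:      "requests",
		Version:   "2.31.0",
		FoundBy:   "python-pip-cataloger",
		Locations: []string{"/usr/lib/python3/dist-packages/requests-2.31.0.dist-info/METADATA"},
		Language:  "python",
		Type:      "python",
		Metadata: map[string]string{
			// raw, flat values lifted from the underlying source material
			"provides_extra": "socks",
		},
	}
	fmt.Printf("%+v\n", p)
}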
When pkg.Package is serialized an additional MetadataType is shown. This is a label that helps consumers understand the data shape of the Metadata field.
By convention the MetadataType value should follow these rules of thumb:
Only use lowercase letters, numbers, and hyphens. Use hyphens to separate words.
Try to anchor the name in the ecosystem, language, or packaging tooling it belongs to. For a package manager for a language ecosystem the language, framework or runtime should be used as a prefix. For instance pubspec-lock is an OK name, but dart-pubspec-lock is better. For an OS package manager this is not necessary (e.g. apk-db-entry is a good name, but alpine-apk-db-entry is not since alpine and the a in apk is redundant).
Be as specific as possible to what the data represents. For instance ruby-gem is NOT a good MetadataType value, but ruby-gemspec is. Why? Ruby gem information can come from a gemspec file or a Gemfile.lock, which are very different. The latter name provides more context as to what to expect.
Should describe WHAT the data is, NOT HOW it’s used. For instance r-description-installed-file is NOT a good MetadataType value since it’s trying to convey that we use the DESCRIPTION file in the R ecosystem to detect installed packages. Instead simply describe what the DESCRIPTION file is itself without context of how it’s used: r-description.
Use the lock suffix to distinguish between manifest files that loosely describe package version requirements vs files that strongly specify one and only one version of a package ("lock" files). This should only be used for package managers that have the guide vs lock distinction, and would not be appropriate otherwise (e.g. rpm does not have a guide vs lock, so lock should NOT be used to describe a db entry).
Use the archive suffix to indicate a package archive (e.g. rpm file, apk file, etc) that describes the contents of the package. For example an RPM file that was cataloged would have a rpm-archive metadata type (not to be confused with an RPM DB record entry which would be rpm-db-entry).
Use the entry suffix to indicate information about a package that was found as a single entry within a file that has multiple package entries. If the entry was found within a DB or a flat-file store for an OS package manager, you should use db-entry.
Should NOT contain the phrase package, though exceptions are allowed (say if the canonical name literally has the phrase package in it).
Should NOT have a file suffix unless the canonical name has the term "file", such as a pipfile or gemfile. An example of a bad name for this rule is ruby-gemspec-file; a better name would be ruby-gemspec.
Should NOT contain the exact filename+extension. For instance pipfile.lock shouldn't really be in the name; instead try and describe what the file is: python-pipfile-lock (but shouldn't this be python-pip-lock you might ask? No, since the pip package manager is not related to the pipfile project).
Should NOT contain the phrase metadata, unless the canonical name has this term.
Should represent a single use case. For example, trying to describe Hackage metadata with a single HackageMetadata struct (and thus MetadataType) is not allowed since it represents 3 mutually exclusive use cases: representing a stack.yaml, stack.lock, or cabal.project file. Instead, each of these should have their own struct types and MetadataType values.
There are other cases that are not covered by these rules… and that's ok! The goal is to provide a consistent naming scheme that is easy to understand and use when it's applicable. If the rules do not exactly apply in your situation then just use your best judgement (or amend these rules as needed when new common cases come up).
What if the underlying parsed data represents multiple files? There are two approaches to this:
use the primary file to represent all the data. For instance, though the dpkg-cataloger looks at multiple files to get all information about a package, it’s the status file that gets represented.
nest each individual file's data under the Metadata field. For instance, the java-archive-cataloger may find information from one or all of the files pom.xml, pom.properties, and MANIFEST.MF. However, the metadata is simply java-metadata with each possibility as a nested optional field.
Syft Catalogers
Catalogers are the way in which syft is able to identify and construct packages given a targeted list of files.
For example, a cataloger can ask syft for all package-lock.json files in order to parse and raise up javascript packages
(see how file globs and
file parser functions are used
for a quick example).
From a high level catalogers have the following properties:
They are independent from one another. The java cataloger has no idea of the processes, assumptions, or results of the python cataloger, for example.
They do not know what source is being analyzed. Are we analyzing a local directory? an image? if so, the squashed representation or all layers? The catalogers do not know the answers to these questions. Only that there is an interface to query for file paths and contents from an underlying “source” being scanned.
Packages created by the cataloger should not be mutated after they are created. There is one exception made for adding CPEs to a package after the cataloging phase, but that will most likely be moved back into the cataloger in the future.
Cataloger names should be unique and named with the following rules of thumb in mind:
Must end with -cataloger
Use lowercase letters, numbers, and hyphens only
Use hyphens to separate words
Catalogers for language ecosystems should start with the language name (e.g. python- for a cataloger that raises up python packages)
Distinguish between when the cataloger is searching for evidence of installed packages vs declared packages. For example, there are currently two different gemspec-based catalogers, the ruby-gemspec-cataloger and ruby-installed-gemspec-cataloger, where the latter requires that the gemspec is found within a specifications directory (which means it was installed, not just at the root of a source repo).
Building a new Cataloger
Catalogers must fulfill the pkg.Cataloger interface in order to add packages to the SBOM.
All catalogers should be added to:
For reference, catalogers are invoked within syft one after the other, and can be invoked in parallel.
generic.NewCataloger is an abstraction syft uses to make writing common components easier (see the apkdb cataloger for example usage).
It takes the following information as input:
A catalogerName to identify the cataloger uniquely among all other catalogers.
Pairs of file globs as well as parser functions to parse those files. These parser functions return a slice of pkg.Package as well as a slice of artifact.Relationship to describe how the returned packages are related. See the apkdb cataloger parser function as an example, and the sketch below.
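The shape of that glob-to-parser pairing can be sketched as follows. All type names and signatures here are hypothetical and simplified; refer to the apkdb cataloger for the authoritative form:

package main

import (
	"fmt"
	"io"
	"strings"
)

// Package and Relationship are simplified stand-ins for the pkg.Package
// and artifact.Relationship values a parser function returns.
type Package struct{ Name, Version string }
type Relationship struct{ From, To string }

// ParserFn is a hypothetical parser signature: given a file's contents,
// return the packages (and relationships) discovered in it.
type ParserFn func(path string, contents io.Reader) ([]Package, []Relationship, error)

// catalogerSpec pairs a unique cataloger name with glob-to-parser wiring,
// mirroring the inputs described above.
type catalogerSpec struct {
	name    string
	parsers map[string]ParserFn // glob -> parser
}

func main() {
	spec := catalogerSpec{
		name: "example-lockfile-cataloger",
		parsers: map[string]ParserFn{
			"**/example.lock": func(path string, contents io.Reader) ([]Package, []Relationship, error) {
				data, _ := io.ReadAll(contents)
				// a trivial "parser": one package per whitespace-separated name@version token
				var pkgs []Package
				for _, token := range strings.Fields(string(data)) {
					if name, version, ok := strings.Cut(token, "@"); ok {
						pkgs = append(pkgs, Package{Name: name, Version: version})
					}
				}
				return pkgs, nil, nil
			},
		},
	}
	pkgs, _, _ := spec.parsers["**/example.lock"]("example.lock", strings.NewReader("left-pad@1.3.0"))
	fmt.Println(spec.name, pkgs)
}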
Identified packages share a common pkg.Package struct so be sure that when the new cataloger is constructing a new package it is using the Package struct.
If you want to return more information than what is available on the pkg.Package struct then you can do so in the pkg.Package.Metadata section of the struct, which is unique for each pkg.Type.
See the pkg package for examples of the different metadata types that are supported today.
These are plugged into the MetadataType and Metadata fields in the above struct. MetadataType informs which type is being used. Metadata is an interface converted to that type.
Finally, here is an example of where the package construction is done within the apk cataloger:
All catalogers are provided an instance of the file.Resolver to interface with the image and search for files. The implementations for these
abstractions leverage stereoscope in order to perform searching. Here is a
rough outline how that works:
a stereoscope file.Index is searched based on the input given (a path, glob, or MIME type). The index is relatively fast to search, but requires results to be filtered down to the files that exist in the specific layer(s) of interest. This is done automatically by the filetree.Searcher abstraction. This abstraction will fallback to searching directly against the raw filetree.FileTree if the index does not contain the file(s) of interest. Note: the filetree.Searcher is used by the file.Resolver abstraction.
Once the set of files is returned from the filetree.Searcher, the results are filtered down further to return the most unique file results. For example, you may have requested files by a glob that returns multiple results. These results are deduplicated by real file, so if a result contains two references to the same file, say one accessed via symlink and one accessed via the real path, then the real-path reference is returned and the symlink reference is filtered out. If both were accessed by symlink then the first (by lexical order) is returned. This is done automatically by the file.Resolver abstraction.
By the time results reach the pkg.Cataloger you are guaranteed to have a set of unique files that exist in the layer(s) of interest (relative to what the resolver supports).
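The deduplication described above can be sketched as follows: given several references that resolve to the same real path, prefer the real-path reference, otherwise keep the lexically first symlink. This is a simplified illustration, not the file.Resolver implementation:

package main

import (
	"fmt"
	"sort"
)

// fileRef is a simplified stand-in for a resolved file reference.
type fileRef struct {
	AccessPath string // how the file was reached (possibly a symlink)
	RealPath   string // the concrete path within the layer
}

// dedupe keeps one reference per real path: prefer the reference whose
// access path is the real path, otherwise the lexically first symlink.
func dedupe(refs []fileRef) []fileRef {
	best := map[string]fileRef{}
	for _, r := range refs {
		cur, seen := best[r.RealPath]
		switch {
		case !seen:
			best[r.RealPath] = r
		case r.AccessPath == r.RealPath:
			best[r.RealPath] = r
		case cur.AccessPath != cur.RealPath && r.AccessPath < cur.AccessPath:
			best[r.RealPath] = r
		}
	}
	out := make([]fileRef, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].RealPath < out[j].RealPath })
	return out
}

func main() {
	refs := []fileRef{
		{AccessPath: "/etc/alternatives/java", RealPath: "/usr/lib/jvm/bin/java"},
		{AccessPath: "/usr/lib/jvm/bin/java", RealPath: "/usr/lib/jvm/bin/java"},
	}
	fmt.Println(dedupe(refs)) // keeps the real-path reference
}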
Testing
Testing commands
make help shows a list of available commands
make unit, make integration, make cli, and make acceptance run those test suites (see below)
make test runs all those tests (and is therefore pretty slow)
make fixtures clears and re-fetches all test fixtures.
go test ./syft/pkg/ for example can test particular packages, assuming fixtures are already made
make clean-cache cleans all test cache. Note that subsequent test runs will be slower after this
Levels of testing
unit: The default level of testing, distributed throughout the repo. Any _test.go file that does not reside somewhere within the /test directory is a unit test. Other forms of testing should be organized in the /test directory. These tests should focus on correctness of functionality in depth. Test coverage metrics only consider unit tests and no other forms of testing.
integration: located within cmd/syft/internal/test/integration, these tests focus on the behavior surfaced by the common library
entrypoints from the syft package and make light assertions about the results surfaced. Additionally, these tests
tend to make diversity assertions for enum-like objects, ensuring that as enum values are added to a definition
that integration tests will automatically fail if no test attempts to use that enum value. For more details see
the “Data diversity and freshness assertions” section below.
cli: located within test/cli, these are tests that verify the correctness of application behavior from a
snapshot build. This should be used in cases where a unit or integration test will not do or if you are looking
for in-depth testing of code in the cmd/ package (such as testing the proper behavior of application configuration,
CLI switches, and glue code before syft library calls).
acceptance: located within test/compare and test/install, these are smoke-like tests that ensure that application
packaging and installation works as expected. For example, during release we provide RPM packages as a download
artifact. We also have an accompanying RPM acceptance test that installs the RPM from a snapshot build and ensures the
output of a syft invocation matches canned expected output. New acceptance tests should be added for each release artifact
and architecture supported (when possible).
Data diversity and freshness assertions
It is important that tests against the codebase are flexible enough to begin failing when they do not cover “enough”
of the objects under test. “Cover” in this case does not mean that some percentage of the code has been executed
during testing, but instead that there is enough diversity of data input reflected in testing relative to the
definitions available.
For instance, consider an enum-like value like so:
type Language string

const (
	Java       Language = "java"
	JavaScript Language = "javascript"
	Python     Language = "python"
	Ruby       Language = "ruby"
	Go         Language = "go"
)
Say we have a test that exercises all the languages defined today:
func TestCatalogPackages(t *testing.T) {
	testTable := []struct {
		// ... the set of test cases that test all languages
	}{
		// ...
	}

	for _, test := range testTable {
		t.Run(test.name, func(t *testing.T) {
			// use inputFixturePath and assert that syft.CatalogPackages() returns the set of expected Package objects
			// ...
		})
	}
}
Where each test case has a inputFixturePath that would result with packages from each language. This test is
brittle since it does not assert that all languages were exercised directly and future modifications (such as
adding a new language) won’t be covered by any test cases.
To address this the enum-like object should have a definition of all objects that can be used in testing:
type Language string

// const ( Java Language = ..., ... )

var AllLanguages = []Language{
	Java,
	JavaScript,
	Python,
	Ruby,
	Go,
	Rust,
}
Allowing testing to automatically fail when adding a new language:
func TestCatalogPackages(t *testing.T) {
	testTable := []struct {
		// ... the set of test cases that (hopefully) covers all languages
	}{
		// ...
	}

	// new stuff...
	observedLanguages := strset.New()

	for _, test := range testTable {
		t.Run(test.name, func(t *testing.T) {
			// use inputFixturePath and assert that syft.CatalogPackages() returns the set of expected Package objects
			// ...
			// new stuff...
			for _, actualPkg := range actual {
				observedLanguages.Add(string(actualPkg.Language))
			}
		})
	}

	// new stuff...
	for _, expectedLanguage := range pkg.AllLanguages {
		if !observedLanguages.Contains(string(expectedLanguage)) {
			t.Errorf("failed to test language=%q", expectedLanguage)
		}
	}
}
This is a better test since it will fail when someone adds a new language but fails to write a test case that should
exercise that new language. This method is ideal for integration-level testing, where testing correctness in depth
is not needed (that is what unit tests are for) but instead testing in breadth to ensure that units are well integrated.
A similar case can be made for data freshness; if the quality of the results will be diminished if the input data
is not kept up to date then a test should be written (when possible) to assert any input data is not stale.
An example of this is the static list of licenses that is stored in internal/spdxlicense for use by the SPDX
presenters. This list is updated and published periodically by an external group and syft can grab and update this
list by running go generate ./... from the root of the repo.
An integration test has been written that grabs the latest license list version externally and compares that version
with the version generated in the codebase. If they differ, the test fails, indicating to someone that there is an
action needed to update it.
_The key takeaway is to try and write tests that fail when data assumptions change and not just when code changes._
Snapshot tests
The format objects make a lot of use of “snapshot” testing, where you save the expected output bytes from a call into the
git repository and during testing make a comparison of the actual bytes from the subject under test with the golden
copy saved in the repo. The “golden” files are stored in the test-fixtures/snapshot directory relative to the go
package under test and should always be updated by invoking go test on the specific test file with a specific CLI
update flag provided.
Many of the Format tests make use of this approach, where the raw SBOM report is saved in the repo and the test
compares that SBOM with what is generated from the latest presenter code. The following command can be used to
update the golden files for the various snapshot tests:
make update-format-golden-files
These flags are defined at the top of the test files that have tests that use the snapshot files.
Snapshot testing is only as good as the manual verification of the golden snapshot file saved to the repo! Be careful
and diligent when updating these files.
5.2 - Grype
Developer guidelines when contributing to Grype
There are a few useful things to know before diving into the codebase. This project depends on a few things being available like a vulnerability database, which you might want to create manually instead of retrieving a released version.
Do also take note of the General Guidelines that apply across all Anchore Open Source projects.
Getting started
After cloning do the following:
run go build ./cmd/grype to get a binary named main from the source (use -o <name> to get a differently named binary), or optionally go run ./cmd/grype to run from source.
In order to run tests and build all artifacts:
run make bootstrap to download go mod dependencies, create the /.tmp dir, and download helper utilities (this only needs to be done once or when build tools are updated).
run make to run linting, tests, and other verifications to make certain everything is working alright.
The main make tasks for common static analysis and testing are lint, format, lint-fix, unit, and integration.
See make help for all the current make tasks.
Relationship to Syft
Grype uses Syft as a library for all-things related to obtaining and parsing the given scan target (pulling container
images, parsing container images, indexing directories, cataloging packages, etc). Releases of Grype should
always use released versions of Syft (commits that are tagged and show up in the GitHub releases page). However,
continually integrating unreleased Syft changes into Grype incrementally is encouraged
(e.g. go get github.com/anchore/syft@main) as long as by the time a release is cut the Syft version is updated
to a released version (e.g. go get github.com/anchore/syft@v<semantic-version>).
Inspecting the database
The currently supported database format is Sqlite3. Install sqlite3 in your system and ensure that the sqlite3 executable is available in your path. Ask grype about the location of the database, which will be different depending on the operating system:
$ go run ./cmd/grype db status
Location: /Users/alfredo/Library/Caches/grype/db
Built: 2020-07-31 08:18:29 +0000 UTC
Current DB Version: 1
Require DB Version: 1
Status: Valid
The database is located within the XDG_CACHE_HOME path. To verify the database filename, list that path:
To make the reporting from Sqlite3 easier to read, enable the following:
sqlite> .mode column
sqlite> .headers on
List the tables:
sqlite> .tables
id vulnerability vulnerability_metadata
In this example you retrieve a specific vulnerability from the nvd namespace:
sqlite> select * from vulnerability where (namespace="nvd" and package_name="libvncserver") limit 1;

id             record_source  package_name  namespace  version_constraint  version_format  cpes                                                          proxy_vulnerabilities
-------------  -------------  ------------  ---------  ------------------  --------------  ------------------------------------------------------------  ---------------------
CVE-2006-2450                 libvncserver  nvd        =0.7.1              unknown         ["cpe:2.3:a:libvncserver:libvncserver:0.7.1:*:*:*:*:*:*:*"]    []
5.3 - Grant
Developer guidelines when contributing to Grant
We welcome contributions to the project! There are a few useful things to know before diving into the codebase.
Do also take note of the General Guidelines that apply across all Anchore Open Source projects.
Getting Started
After pulling the repository, you can get started by running the following command to install the necessary dependencies and build grant from source:
make
After building the project, you can run the following command to run the newly built binary:
./snapshot/<os>-build_<os>_<arch>/grant
Keep in mind the build artifacts are placed in the snapshot directory and built for each supported platform so choose the appropriate binary for your platform.
If you just want to run the project with any local changes you have made, you can run the following command:
go run cmd/grant/main.go
Testing
You can run the tests for the project by running the following command:
make test
Linting
You can run the linter for the project by running the following command:
make static-analysis
Making a PR
Just fork the repository, make your changes on a branch, and submit a PR. We will review your changes and merge them if they are good to go.
When making a PR, please make sure to include a description of the changes you have made and the reasoning behind them.
If you are adding a new feature, please include tests for the new feature. If you are fixing a bug, please include a test that reproduces the bug and ensure that the test passes after your changes.
5.4 - Grype-DB
Developer guidelines when contributing to Grype-DB
We welcome contributions to the project! There are a few useful things to know before diving into the codebase.
Do also take note of the General Guidelines that apply across all Anchore Open Source projects.
Getting started
This codebase is primarily Go, however, there are also Python scripts critical to the daily DB publishing process as
well as acceptance testing. You will require the following:
Python 3.8+ installed on your system. Consider using pyenv if you do not have a
preference for managing python interpreter installations.
zstd binary utility if you are packaging v6+ DB schemas
(optional) xz binary utility if you have specifically overridden the package command options
Poetry installed for dependency and virtualenv management of python dependencies. To install it:
To download go tooling used for static analysis and dependent go modules run the following:
make bootstrap
Getting an initial vulnerability data cache
In order to build a grype DB you will need a local cache of vulnerability data:
make download-all-provider-cache
This will populate the ./data directory locally with everything needed to run grype-db build (without needing to run grype-db pull).
Running tests
To unit test the Go code and unit test the publisher python scripts:
make unit
To verify that all supported schema versions interop with grype run:
make acceptance
# Note: this may take a while... go make some coffee.
The main make tasks for common static analysis functions are lint, format, lint-fix, unit, and cli.
See make help for all the current make tasks.
Create a new DB schema
Create a new v# schema package in the grype repo (within pkg/db)
Create a new v# schema package in the grype-db repo (use the bump-schema.py helper script) that uses the new changes from grype
Modify the manager/src/grype_db_manager/data/schema-info.json to pin the last-latest version to a specific version of grype and add the new schema version pinned to the “main” branch of grype (or a development branch)
Update all references in grype to use the new schema
Use the Staging DB Publisher workflow to test your DB changes with grype in a flow similar to the daily DB publisher workflow
Making a staging DB
While developing a new schema version it may be useful to get a DB built for you by the Staging DB Publisher GitHub Actions workflow.
This workflow exercises the same code as the Daily DB Publisher, with the exception that only a single schema is built and is validated against a given development branch of grype.
When these DBs are published you can point grype at the proper listing file like so:
grype-db is essentially an application that extracts information from upstream vulnerability data providers,
transforms it into smaller records targeted for grype consumption, and loads the individual records into a new SQLite DB.
~~~~~ "Pull" ~~~~~ ~~~~~~~~~~~~~~~~~~ "Build" ~~~~~~~~~~~~~~~~ ~~ "Package" ~~
┌─────────────────┐ ┌───────────────────┐ ┌───────────────┐ ┌─────────────┐
│ Pull vuln data │ │ Transform entries │ │ Load entries │ │ Package DB │
│ from upstream ├────►│ ├────►│ into new DB ├────►│ │
└─────────────────┘ └───────────────────┘ └───────────────┘ └─────────────┘
What makes grype-db a little more unique than a typical ETL job is the extra responsibility of needing to
transform the most recent vulnerability data shape (defined in the vunnel repo) to all supported DB schema versions.
From the perspective of the Daily DB Publisher workflow, (abridged) execution looks something like this:
┌─────────────────┐ ┌──────────────┐ ┌────────────────┐
│ Pull vuln data ├────┬────►│ Build V1 DB │────►│ Package V1 DB │ ...
└─────────────────┘ │ └──────────────┘ └────────────────┘
│ ┌──────────────┐ ┌────────────────┐
├────►│ Build V2 DB │────►│ Package V2 DB │ ...
│ └──────────────┘ └────────────────┘
│ ┌──────────────┐ ┌────────────────┐
├────►│ Build V3 DB │────►│ Package V3 DB │ ...
│ └──────────────┘ └────────────────┘
...
In order to support multiple DB schemas easily from a code-organization perspective the following abstractions exist:
Provider: responsible for providing raw vulnerability data files that are cached locally for later processing.
Processor: responsible for unmarshalling any entries given by the Provider, passing them into Transformers, and
returning any resulting entries. Note: the object definition is schema-agnostic but instances are schema-specific
since Transformers are dependency-injected into this object.
Transformer: Takes raw data entries of a specific vunnel-defined schema
and transforms the data into schema-specific entries to later be written to the database. Note: the object definition
is schema-specific, encapsulating grypeDB/v# specific objects within schema-agnostic Entry objects.
Entry: Encapsulates schema-specific database records produced by Processors/Transformers (from the provider data)
and accepted by Writers.
Writer: Takes Entry objects and writes them to a backing store (today a SQLite database). Note: the object
definition is schema-specific and typically references grypeDB/v# schema-specific writers.
All the above abstractions are defined in the pkg/data Go package and are used together commonly in the following flow:
Where there is a data.Provider for each upstream data source (e.g. canonical, redhat, github, NIST, etc.),
a data.Processor for every vunnel-defined data shape (github, os, msrc, nvd, etc… defined in the vunnel repo),
a data.Transformer for every processor and DB schema version pairing, and a data.Writer for every DB schema version.
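A rough, hypothetical sketch of how those abstractions compose is shown below; the actual interfaces in pkg/data differ in their exact signatures, so treat these shapes as assumptions:

package main

import "fmt"

// Entry is a schema-specific record destined for the database,
// a simplified stand-in for the data.Entry described above.
type Entry struct {
	DBSchemaVersion int
	Data            any
}

// The types below sketch the pull -> transform -> write flow;
// they are illustrative, not the real pkg/data definitions.
type Provider interface{ Pull() ([][]byte, error) }              // cache raw vuln data files locally
type Transformer func(raw []byte) ([]Entry, error)               // raw record -> schema-specific entries
type Processor interface{ Process(raw []byte) ([]Entry, error) } // unmarshal, then delegate to an injected Transformer
type Writer interface{ Write(entries ...Entry) error }           // persist entries (e.g. into SQLite)

type printWriter struct{}

func (printWriter) Write(entries ...Entry) error {
	for _, e := range entries {
		fmt.Printf("writing schema v%d entry: %v\n", e.DBSchemaVersion, e.Data)
	}
	return nil
}

func main() {
	var w Writer = printWriter{}
	transform := Transformer(func(raw []byte) ([]Entry, error) {
		return []Entry{{DBSchemaVersion: 5, Data: string(raw)}}, nil
	})
	entries, _ := transform([]byte("CVE-2023-0001 affects example<1.2.3"))
	_ = w.Write(entries...)
}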
From a Go package organization perspective, the above abstractions are organized as follows:
grype-db/
└── pkg
├── data # common data structures and objects that define the ETL flow
├── process
│ ├── processors # common data.Processors to call common unmarshallers and pass entries into data.Transformers
│ ├── v1
│ │ ├── processors.go # wires up all common data.Processors to v1-specific data.Transformers
│ │ ├── writer.go # v1-specific store writer
│ │ └── transformers # v1-specific transformers
│ ├── v2
│ │ ├── processors.go # wires up all common data.Processors to v2-specific data.Transformers
│ │ ├── writer.go # v2-specific store writer
│ │ └── transformers # v2-specific transformers
│ └── ...more schema versions here...
└── provider # common code to pull, unmarshal, and cache upstream vuln data into local files
└── ...
DB structure and definitions
The definitions of what goes into the database and how to access it (both reads and writes) live in the public grype
repo under the db package. Responsibilities of grype (not grype-db) include (but are not limited to):
What tables are in the database
What columns are in each table
How each record should be serialized for writing into the database
How records should be read/written from/to the database
Providing rich objects for dealing with schema-specific data structures
The name of the SQLite DB file within an archive
The definition of a listing file and listing file entries
The purpose of grype-db is to use the definitions from grype.db and the upstream vulnerability data to
create DB archives and make them publicly available for consumption via grype.
DB listing file
The listing file contains URLs to grype DB archives that are available for download, organized by schema version, and
ordered by latest-date-first.
The definition of the listing file resides in grype, however, it is the responsibility of the grype-db repo
to generate DBs and re-create the listing file daily.
As long as grype has been configured to point to the correct listing file, the DBs can be stored separately from the
listing file, be replaced with a running service returning the listing file contents, or can be mirrored for systems
behind an air gap.
Getting a grype DB out to OSS users (daily)
There are two workflows that drive getting a new grype DB out to OSS users:
The daily data sync workflow, which uses vunnel to pull upstream vulnerability data.
The daily DB publisher workflow, which builds and publishes a grype DB from the data obtained in the daily data sync workflow.
Daily data sync workflow
This workflow takes the upstream vulnerability data (from canonical, redhat, debian, NVD, etc), processes it, and
writes the results to the OCI repos.
┌──────────────┐ ┌──────────────────────────────────────────────────────────┐
│ Pull alpine ├────────►│ Publish to ghcr.io/anchore/grype-db/data/alpine:<date> │
└──────────────┘ └──────────────────────────────────────────────────────────┘
┌──────────────┐ ┌──────────────────────────────────────────────────────────┐
│ Pull amazon ├────────►│ Publish to ghcr.io/anchore/grype-db/data/amazon:<date> │
└──────────────┘ └──────────────────────────────────────────────────────────┘
┌──────────────┐ ┌──────────────────────────────────────────────────────────┐
│ Pull debian ├────────►│ Publish to ghcr.io/anchore/grype-db/data/debian:<date> │
└──────────────┘ └──────────────────────────────────────────────────────────┘
┌──────────────┐ ┌──────────────────────────────────────────────────────────┐
│ Pull github ├────────►│ Publish to ghcr.io/anchore/grype-db/data/github:<date> │
└──────────────┘ └──────────────────────────────────────────────────────────┘
┌──────────────┐ ┌──────────────────────────────────────────────────────────┐
│ Pull nvd ├────────►│ Publish to ghcr.io/anchore/grype-db/data/nvd:<date> │
└──────────────┘ └──────────────────────────────────────────────────────────┘
... repeat for all upstream providers ...
Once all providers have been updated a single vulnerability cache OCI repo is updated with all of the latest vulnerability data at ghcr.io/anchore/grype-db/data:<date>. This repo is what is used downstream by the DB publisher workflow to create grype DBs.
The in-repo .grype-db.yaml and .vunnel.yaml configurations are used to define the upstream data sources, how to obtain them, and where to put the results locally.
Daily DB publishing workflow
This workflow takes the latest vulnerability data cache, builds a grype DB, and publishes it for general consumption.
The manager/ directory contains all code responsible for driving the Daily DB Publisher workflow, generating DBs
for all supported schema versions and making them available to the public. The publishing process is made of three steps
(depicted and described below):
~~~~~ 1. Pull ~~~~~~~~~~~~~~~~~~~~~~~ 2. Generate Databases ~~~~~~~~~~~~~~~~~~~~~~ 3. Update Listing ~~
┌─────────────────┐ ┌──────────────┐ ┌───────────────┐ ┌────────────────┐ ┌─────────────────────┐
│ Pull vuln data ├──┬──►│ Build V1 DB ├────►│ Package V1 DB ├────►│ Upload Archive ├──┬──►│ Update listing file │
└─────────────────┘ │ └──────────────┘ └───────────────┘ └────────────────┘ │ └─────────────────────┘
(from the daily │ ┌──────────────┐ ┌───────────────┐ ┌────────────────┐ │
sync workflow ├──►│ Build V2 DB ├────►│ Package V2 DB ├────►│ Upload Archive ├──┤
output) │ └──────────────┘ └───────────────┘ └────────────────┘ │
│ │
└──► ...repeat for as many DB schemas are supported... ──┘
Note: Running these steps locally may result in publishing a locally generated DB to production, which should never be done.
pull: Download the latest vulnerability data from various upstream data sources into a local directory.
# from the repo root
make download-all-provider-cache
The destination for the provider data is in the data/vunnel directory.
generate: Build databases for all supported schema versions based on the latest vulnerability data and upload them to S3.
# from the repo root
# must be in a poetry shell
grype-db-manager db build-and-upload --schema-version <version>
This call needs to be repeated for all schema versions that are supported (see manager/src/grype_db_manager/data/schema-info.json).
Once built, each DB is smoke tested with grype by comparing the performance of the last OSS DB with the current (local) DB, using vulnerability-match-labels to gauge quality differences.
Only DBs that pass validation are uploaded to S3. At this step the DBs can be downloaded from S3 but are NOT yet discoverable via grype db download (this is what the listing file update will do).
update-listing: Generate and upload a new listing file to S3 based on the existing listing file and newly
discovered DB archives already uploaded to S3.
# from the repo root
# must be in a poetry shell
grype-db-manager listing update
During this step the locally crafted listing file is tested against installations of grype. The correctness of the reports is NOT verified (since this was done in a previous step); however, in order to pass, the scan must have a non-zero count of matches found.
Once the listing file has been uploaded, user-facing grype installations should pick up that there are new DBs available to download.
5.5 - SBOM Action
Developer guidelines when contributing to sbom-action
TODO
5.6 - Scan Action
Developer guidelines when contributing to scan-action
TODO
5.7 - Vunnel
Developer guidelines when contributing to Vunnel
We welcome contributions to the project! There are a few useful things to know before diving into the codebase.
Do also take note of the General Guidelines that apply across all Anchore Open Source projects.
Getting Started
This project requires:
python (>= 3.7)
pip (>= 22.2)
uv
docker
go (>= 1.20)
posix shell (bash, zsh, etc… needed for the make dev “development shell”)
Once you have python and uv installed, get the project bootstrapped:
# clone grype and grype-db, which is needed for provider development
git clone git@github.com:anchore/grype.git
git clone git@github.com:anchore/grype-db.git
# note: if you already have these repos cloned, you can skip this step. However, if they
# reside in a different directory than where the vunnel repo is, then you will need to
# set the `GRYPE_PATH` and/or `GRYPE_DB_PATH` environment variables for the development
# shell to function. You can add these to a local .env file in the vunnel repo root.
# clone the vunnel repo
git clone git@github.com:anchore/vunnel.git
cd vunnel
# get basic project tooling
make bootstrap
# install project dependencies
uv sync --all-extras --dev
Pre-commit is used to help enforce static analysis checks with git hooks:
uv run pre-commit install --hook-type pre-push
Developing
The easiest way to develop on a provider is to use the development shell, selecting the specific provider(s) you'd like to focus your development workflow on:
# Specify one or more providers you want to develop on.
# Any provider from the output of "vunnel list" is valid.
# Specify multiple as a space-delimited list:
# make dev providers="oracle wolfi nvd"
$ make dev provider="oracle"
Entering vunnel development shell...
• Configuring with providers: oracle ...
• Writing grype config: /Users/wagoodman/code/vunnel/.grype.yaml ...
• Writing grype-db config: /Users/wagoodman/code/vunnel/.grype-db.yaml ...
• Activating virtual env: /Users/wagoodman/code/vunnel/.venv ...
• Installing editable version of vunnel ...
• Building grype ...
• Building grype-db ...
Note: development builds grype and grype-db are now available in your path.
To update these builds run 'make build-grype' and 'make build-grype-db' respectively.
To run your provider and update the grype database run 'make update-db'.
Type 'exit' to exit the development shell.
You can now run the provider you specified in the make dev command, build an isolated grype DB, and import the DB into grype:
$ make update-db
• Updating vunnel providers ...
[0000]  INFO grype-db version: ede464c2def9c085325e18ed319b36424d71180d-adhoc-build
...
[0000]  INFO configured providers parallelism=1 providers=1
[0000] DEBUG └── oracle
[0000] DEBUG all providers started, waiting for graceful completion...
[0000]  INFO running vulnerability provider provider=oracle
[0000] DEBUG oracle: 2023-03-07 15:44:13 [INFO] running oracle provider
[0000] DEBUG oracle: 2023-03-07 15:44:13 [INFO] downloading ELSA from https://linux.oracle.com/security/oval/com.oracle.elsa-all.xml.bz2
[0019] DEBUG oracle: 2023-03-07 15:44:31 [INFO] wrote 6298 entries
[0019] DEBUG oracle: 2023-03-07 15:44:31 [INFO] recording workspace state
• Building grype-db ...
[0000]  INFO grype-db version: ede464c2def9c085325e18ed319b36424d71180d-adhoc-build
[0000]  INFO reading all provider state
[0000]  INFO building DB build-directory=./build providers=[oracle] schema=5
• Packaging grype-db ...
[0000]  INFO grype-db version: ede464c2def9c085325e18ed319b36424d71180d-adhoc-build
[0000]  INFO packaging DB from="./build" for="https://toolbox-data.anchore.io/grype/databases"
[0000]  INFO created DB archive path=build/vulnerability-db_v5_2023-03-07T20:44:13Z_405ae93d52ac4cde6606.tar.gz
• Importing DB into grype ...
Vulnerability database imported
You can now run grype that uses the newly created DB:
$ grype oraclelinux:8.4
 ✔ Pulled image
✔ Loaded image
✔ Parsed image
✔ Cataloged packages [195 packages]
✔ Scanning image... [193 vulnerabilities]
├── 0 critical, 25 high, 146 medium, 22 low, 0 negligible
└── 193 fixed
NAME INSTALLED FIXED-IN TYPE VULNERABILITY SEVERITY
bind-export-libs 32:9.11.26-4.el8_4 32:9.11.26-6.el8 rpm ELSA-2021-4384 Medium
bind-export-libs 32:9.11.26-4.el8_4 32:9.11.36-3.el8 rpm ELSA-2022-2092 Medium
bind-export-libs 32:9.11.26-4.el8_4 32:9.11.36-3.el8_6.1 rpm ELSA-2022-6778 High
bind-export-libs 32:9.11.26-4.el8_4 32:9.11.36-5.el8 rpm ELSA-2022-7790 Medium
# note that we're using the database we just built...
$ grype db status
Location:  /Users/wagoodman/code/vunnel/.cache/grype/5   # <--- this is the local DB we just built
...
# also note that we're using a development build of grype
$ which grype
/Users/wagoodman/code/vunnel/bin/grype
The development builds of grype and grype-db provided are derived from ../grype and ../grype-db paths relative to the vunnel project.
If you want to use a different path, you can set the GRYPE_PATH and GRYPE_DB_PATH environment variables. This can be
persisted by adding a .env file to the root of the vunnel project:
# example .env file in the root of the vunnel repo
GRYPE_PATH=~/somewhere/else/grype
GRYPE_DB_PATH=~/also/somewhere/else/grype-db
To rebuild the grype and grype-db binaries from local source, run:
make build-grype
make build-grype-db
This project uses Make for running common development tasks:
make # run static analysis and unit testing
make static-analysis # run static analysis
make unit # run unit tests
make format # format the codebase with black
make lint-fix # attempt to automatically fix linting errors
...
If you want to see all of the things you can do:
make help
If you want to use a locally-editable copy of vunnel while you develop without the custom development shell:
uv pip uninstall vunnel #... if you already have vunnel installed in this virtual env
uv pip install -e .
Snapshot Tests
To ensure that the same feed state from providers produces the same
set of vulnerabilities, snapshot testing is used.
Snapshot tests are run as part of ordinary unit tests, and will run during
make unit.
To update snapshots, run the following pytest command. (Note that this example
is for the debian provider, and the test name and path will be different for
other providers):
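An illustrative invocation (the test path and name, and the pytest-snapshot --snapshot-update flag used by this repo’s unit tests, may differ for your provider):
uv run pytest tests/unit/providers/debian/test_debian.py --snapshot-update
Architecture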
Vunnel is a CLI tool that downloads and processes vulnerability data from various sources (in the codebase, these are called “providers”).
Conceptually, one or more invocations of Vunnel produce a single data directory which Grype-DB uses to create a Grype database.
Additionally, the Vunnel CLI tool is optimized to run a single provider at a time rather than orchestrating multiple providers at once. Grype-DB is the tool that collates output from multiple providers into a single database, and it is ultimately responsible for orchestrating multiple Vunnel calls to prepare the input data.
A “Provider” is the core abstraction for Vunnel and represents a single source of vulnerability data. Vunnel is a CLI wrapper
around multiple vulnerability data providers.
All provider implementations should…
live under src/vunnel/providers in their own directory (e.g. the NVD provider code is under src/vunnel/providers/nvd/...)
be independent of other vulnerability providers’ data; that is, the debian provider CANNOT reach into the NVD provider directory to look up information (such as severity)
follow the workspace conventions for downloaded provider inputs, produced results, and tracking of metadata
Each provider has a “workspace” directory within the “vunnel root” directory (defaults to ./data) named after the provider.
data/                   # the "vunnel root" directory
└── alpine/             # the provider workspace directory
    ├── input/          # any file that needs to be downloaded and referenced should be stored here
    ├── results/        # schema-compliant vulnerability results (1 record per file)
    ├── checksums       # listing of result file checksums (xxh64 algorithm)
    └── metadata.json   # metadata about the input and result files
The metadata.json and checksums are written out after all results are written to results/. An example metadata.json:
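The exact contents vary by provider; the following is a rough sketch of the shape (field names follow the descriptions below, while every value, including the URLs, is made up for illustration):
{
  "provider": "alpine",
  "urls": ["https://secdb.alpinelinux.org/v3.3/main.json"],
  "listing": {
    "digest": "dd3bb0f6c21f3936",
    "path": "checksums",
    "algorithm": "xxh64"
  },
  "timestamp": "2023-03-07T20:44:13Z",
  "schema": {
    "version": "1.0.0",
    "url": "https://raw.githubusercontent.com/anchore/vunnel/main/schema/provider-workspace-state/schema-1.0.0.json"
  }
}
Where: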
provider: the name of the provider that generated the results
urls: the URLs that were referenced to generate the results
listing: the path to the checksums listing file that lists all of the results, the checksum of that file, and the algorithm used to checksum the file (and the same algorithm used for all contained checksums)
timestamp: the point in time when the results were generated or last updated
schema: the data shape that the current file conforms to
All results from a provider are handled by a common base-class helper (provider.Provider.results_writer()), whose storage behavior is driven
by the application configuration (e.g. JSON flat files or a SQLite database). The data shape of the results is
self-describing via an envelope with a schema reference. For example:
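A sketch of a single result record (the schema URL shown is illustrative and the item body is elided):
{
  "schema": "https://raw.githubusercontent.com/anchore/vunnel/main/schema/vulnerability/os/schema-1.0.0.json",
  "identifier": "3.3/cve-2015-8366",
  "item": { ... }
}
Where: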
the schema field is a URL to the schema that describes the data shape of the item field
the identifier field should have a unique identifier within the context of the provider results
the item field is the actual vulnerability data, and the shape of this field is defined by the schema
Note that the identifier is 3.3/cve-2015-8366 and not just cve-2015-8366 in order to uniquely identify
cve-2015-8366 as applied to the alpine 3.3 distro version among other records in the results directory.
Only JSON payloads are currently supported.
Multiple vulnerability schemas are supported within the vunnel repo; the schema kinds are listed under “Adding a new provider” below.
If at any point a breaking change needs to be made to a provider (say, while the schema remains the same), then you
can set the __version__ attribute on the provider class to a new integer value (incrementing from 1 onwards). This
is a way to indicate that the cached input/results are not compatible with the output of the current version of the
provider, in which case the next invocation of the provider will delete the previous input and results before running.
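For example (a minimal sketch; the rest of the provider class is elided):
class Provider(provider.Provider):
    # bump this when previously cached input/results are no longer usable
    __version__ = 2
    ...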
Provider configurations
Each provider has a configuration object defined next to the provider class. This object is used in the vunnel application
configuration and is passed as input to the provider class. Take the debian provider configuration for example:
from dataclasses import dataclass, field

from vunnel import provider, result


@dataclass
class Config:
    runtime: provider.RuntimeConfig = field(
        default_factory=lambda: provider.RuntimeConfig(
            result_store=result.StoreStrategy.SQLITE,
            existing_results=provider.ResultStatePolicy.DELETE_BEFORE_WRITE,
        ),
    )

    request_timeout: int = 125
Every provider configuration must:
be a dataclass
have a runtime field of type provider.RuntimeConfig
The runtime field is used to configure common behaviors of the provider that are enforced within the vunnel.provider.Provider subclass. Options include:
on_error: what to do when the provider fails, sub fields include:
action: choose to fail, skip, or retry when the failure occurs
retry_count: the number of times to retry the provider before failing (only applicable when action is retry)
retry_delay: the number of seconds to wait between retries (only applicable when action is retry)
input: what to do about the input data directory on failure (such as keep or delete)
results: what to do about the results data directory on failure (such as keep or delete)
existing_results: what to do when the provider is run again and the results directory already exists. Options include:
delete-before-write: delete the existing results just before writing the first processed (new) result
delete: delete existing results before running the provider
keep: keep the existing results
existing_input: what to do when the provider is run again and the input directory already exists. Options include:
delete: delete the existing input before running the provider
keep: keep the existing input
result_store: where to store the results. Options include:
sqlite: store results in key-value form in a SQLite database, where keys are the record identifiers and values are the JSON vulnerability records
flat-file: store results in JSON files named after the record identifiers
Any provider-specific config options can be added to the configuration object as needed (such as request_timeout, which is a common field).
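For reference, here is a rough sketch of how a provider section might appear in the vunnel application configuration (key names mirror the options above; the top-level layout and the values shown are illustrative):
providers:
  debian:
    request_timeout: 125
    runtime:
      existing_input: keep
      existing_results: delete-before-write
      result_store: sqlite
      on_error:
        action: fail
        retry_count: 3
        retry_delay: 5
        input: keep
        results: keep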
Adding a new provider
“Vulnerability matching” is the process of taking a list of vulnerabilities and matching them against a list of packages.
A provider in this repo is responsible for the “vulnerability” side of this process. The “package” side is handled by
Syft. A prerequisite for adding a new provider is that Syft can catalog the package types that
the provider is feeding vulnerability data for, so Grype can perform the matching from these two sources.
To add a new provider, you will need to create a new provider class under /src/vunnel/providers/<name> that inherits from provider.Provider and implements the following (a structural sketch follows this list):
name(): a unique and semantically-useful name for the provider (same as the name of the directory)
update(): downloads and processes the raw data, writing all results with self.results_writer()
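A very rough structural sketch of such a class (argument lists and the writer usage are simplified and assumed; the “example” provider in the repo is the authoritative template):
# sketch only: constructor wiring and writer calls are elided/simplified
from vunnel import provider


class Provider(provider.Provider):
    @classmethod
    def name(cls) -> str:
        # should match the directory name under src/vunnel/providers/
        return "my-provider"

    def update(self, *args, **kwargs):
        # download the upstream data into the workspace input/ directory,
        # normalize it into schema-compliant records, then persist each
        # record through the common results writer
        with self.results_writer() as writer:
            ...  # write each record here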
All results must conform to a particular schema; today there are a few kinds:
os: a generic operating system vulnerability (e.g. redhat, debian, ubuntu, alpine, wolfi, etc.)
nvd: tailored to describe vulnerabilities from the NVD
github-security-advisory: tailored to describe vulnerabilities from GitHub
Once the provider is implemented, you will need to wire it up into the application in a couple places:
add a new entry under the dispatch table in src/vunnel/providers/__init__.py mapping your provider name to the class
add the provider configuration to the application configuration under src/vunnel/cli/config.py (specifically the Providers dataclass)
For a more detailed example on the implementation details of a provider see the “example” provider.
Validating a new provider has different implications depending on what is being added. For example, if the provider
adds a new vulnerability source but ultimately uses an existing schema to express results, then there may be very little to do!
If you are adding a new schema, then the downstream data pipeline will need to be altered to support reading data in the new schema.
Please feel free to reach out to a maintainer on an incomplete draft PR and we can help you get it over the finish line!
…for an existing schema
1. Fork Vunnel and add the new provider.
Take a look at the example provider in the example directory. You are encouraged to copy example/awesome/* into
src/vunnel/providers/YOURPROVIDERNAME/ and modify it to fit the needs of your new provider; however, this is not required:
# from the root of the vunnel repo
cp -a example/awesome src/vunnel/providers/YOURPROVIDERNAME
See the “example” provider README as well as the code comments for steps and considerations to take when implementing a new provider.
Once implemented, you should be able to see the new provider in the vunnel list command and run it with vunnel run <name>.
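For example (run inside the development shell, or wherever the vunnel CLI is installed; the provider name is a placeholder):
# confirm the new provider is registered
vunnel list | grep your-provider-name
# download and process vulnerability data for just that provider
vunnel run your-provider-name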
The entries written should end up in a specific namespace in the downstream DB, as indicated in the record.
This namespace is needed when making Grype changes.
While developing the provider, consider using the make dev provider="<your-provider-name>" development shell to run the provider and manually test the results against grype.
At this point you can optionally open a Vunnel PR with your new provider and a Maintainer can help with the next steps. Or if you’d like to get PR changes merged faster you can continue with the next steps.
2. Fork Grype and map distro type to a specific namespace.
This step might not be needed depending on the provider.
Common reasons for needing Grype changes include:
Grype does not support the distro type and it needs to be added. See the grype/distro/types.go file to add the new distro.
Grype supports the distro already, but matching is disabled. See the grype/distro/distro.go file to enable the distro explicitly.
3. In Vunnel: add test images for your provider to tests/quality/config.yaml.
These images are used to test the provider on PRs and nightly builds to verify the specific provider is working.
Always use both the image tag and digest for all container image entries.
Pick an image that has a good representation of the package types that your new provider is adding vulnerability data for.
4. In Vunnel: swap the tools to your Grype branch in tests/quality/config.yaml.
If you want the PR quality gate checks to pass with your specific Grype changes (if you have any), update the
yardstick.tools[*] entries for grype to use a version that points to your fork (e.g. your-fork-username/grype@main).
If you don’t have any Grype changes, you can skip this step.
5. In Vunnel: add new “vulnerability match labels” to annotate True and False positive findings with Grype.
In order to evaluate the quality of the new provider, we need to know what the expected results are. This is done by
annotating Grype results with “True Positive” labels (good results) and “False Positive” labels (bad results). We’ll use
Yardstick to do this:
$ cd tests/quality
# capture results with the development version of grype (from your fork)
$ make capture provider=<your-provider-name>
# list your results
$ uv run yardstick result list | grep grype
d415064e-2bf3-4a1d-bda6-9c3957f2f71a docker.io/anc... grype@v0.58.0 2023-03...
75d1fe75-0890-4d89-a497-b1050826d9f6 docker.io/anc... grype[custom-db]@bdcefd2 2023-03...
# use the "grype[custom-db]" result UUID and explore the results and add labels to each entry
$ uv run yardstick label explore 75d1fe75-0890-4d89-a497-b1050826d9f6
# You can use the yardstick TUI to label results:
# - use "T" to label a row as a True Positive
# - use "F" to label a row as a False Positive
# - Ctrl-Z to undo a label
# - Ctrl-S to save your labels
# - Ctrl-C to quit when you are done
Later we’ll open a PR in the vulnerability-match-labels repo to persist these labels.
In the meantime we can iterate locally with the labels we’ve added.
6. In Vunnel: run the quality gate.
cd tests/quality
# runs your specific provider to gather vulnerability data, builds a DB, and runs grype with the new DB
make capture provider=<your-provider-name>
# evaluate the quality gate
make validate
This uses the latest Grype-DB release to build a DB and the specified Grype version with a DB containing only data from the new provider.
You are looking for a passing run before continuing further.
7. Persist the new labels to the vulnerability-match-labels repo.
Vunnel uses the labels in the vulnerability-match-labels repo via a git submodule. We’ve already added labels locally
within this submodule in an earlier step. To persist these labels we need to push them to a fork and open a PR:
# fork the github.com/anchore/vulnerability-match-labels repo, but you do not need to clone it...
# from the Vunnel repo...
$ cd tests/quality/vulnerability-match-labels
$ git remote add fork git@github.com:your-fork-name/vulnerability-match-labels.git
$ git checkout -b 'add-labels-for-<your-provider-name>'
$ git status
# you should see changes from the labels/ directory for your provider that you added
$ git add .
$ git commit -m 'add labels for <your-provider-name>'
$ git push fork add-labels-for-<your-provider-name>
Note: you will not be able to open a Vunnel PR that passes PR checks until the labels are merged into the vulnerability-match-labels repo.
Once the PR is merged in the vulnerability-match-labels repo you can update the submodule in Vunnel to point to the latest commit in the vulnerability-match-labels repo.
cd tests/quality
git submodule update --remote vulnerability-match-labels
8. In Vunnel: open a PR with your new provider.
The PR will also run all of the same quality gate checks that you ran locally.
If you have Grype changes, you should create a PR for those as well. The Vunnel PR will not pass PR checks until the Grype PR is merged and the tests/quality/config.yaml file is updated to point back to the latest Grype version.
…for a new schema
This is the same process as listed above with a few additional steps:
You will need to add the new schema to the Vunnel repo in the schemas directory.
Grype-DB will need to be updated to support the new schema in the pkg/provider/unmarshal and pkg/process/v* directories.
The Vunnel tests/quality/config.yaml file will need to be updated to use a development grype-db.version that points to your fork.
The final Vunnel PR will not be able to be merged until the Grype-DB PR is merged and the tests/quality/config.yaml file is updated to point back to the latest Grype-DB version.
What might need refactoring?
Looking to help out with improving the code quality of Vunnel, but not sure where to start?
A more general approach is to use radon to search for complexity and maintainability issues:
$ radon cc src --total-average -nb
src/vunnel/provider.py
M 115:4 Provider._on_error - B
src/vunnel/providers/alpine/parser.py
M 73:4 Parser._download - C
M 178:4 Parser._normalize - C
M 141:4 Parser._load - B
C 44:0 Parser - B
src/vunnel/providers/amazon/parser.py
M 66:4 Parser._parse_rss - C
C 164:0 JsonifierMixin - C
M 165:4 JsonifierMixin.json - C
C 32:0 Parser - B
M 239:4 PackagesHTMLParser.handle_data - B
...
The output of radon indicates the type (M=method, C=class, F=function), the path/name, and an A-F grade. Anything that’s not an A is worth taking a look at.
Ideally we should try to get wily diff output into the CI pipeline and post on a sticky PR comment to show regressions (and potentially fail the CI run).
Not everything has types
This codebase has been ported from another repo that did not have any type hints. This is OK, though ideally over time this should
be corrected as new features are added and bug fixes made.
We use mypy today for static type checking; however, the ported code has been explicitly ignored (see pyproject.toml).
If you want to make enhancements in this area consider using automated tooling such as pytype to generate types via inference into .pyi files and later merge them into the codebase with merge-pyi.
Alternatively, a tool like MonkeyType can be used to generate static types from runtime data and incorporate them into the code.
5.8 - Stereoscope
Developer guidelines when contributing to Stereoscope
We welcome contributions to the project! There are a few useful things to know before diving into the codebase.
Do also take note of the General Guidelines that apply across all Anchore Open Source projects.
Getting started
In order to test and develop in this repo you will need the following dependencies installed:
Golang
docker
make
podman (for benchmark and integration tests only)
containerd (for integration tests only)
skopeo (for integration tests only)
After cloning, the following steps can help you get set up:
run make bootstrap to download go mod dependencies, create the /.tmp dir, and download helper utilities.
run make help to view the selection of developer commands in the Makefile
The main make tasks for common static analysis and testing are lint, format, lint-fix, unit, and integration.
See make help for all the current make tasks.
Background
Stereoscope is a library for reading and manipulating container images. It is capable of parsing multiple image
sources, providing a single abstraction for interacting with them. Ultimately this provides a squashfs-like
interface for interacting with image layers as well as a content API for accessing files contained within
the image.
Overview of objects:
image.Image: Once parsed with image.Read() this object represents a container image. Consists of a sequence of image.Layer objects, an image.FileCatalog for accessing files, and a filetree.SearchContext for searching for files from the squashed representation of the image filesystem. Additionally exposes GGCR v1.Image objects for accessing the raw image metadata.
image.Layer: represents a single layer of the image. Consists of a filetree.FileTree that represents the raw layer contents, and a filetree.SearchContext for searching for files relative to the raw (single layer) filetree as well as the squashed representation of the layer relative to all layers below this one. Additionally exposes GGCR v1.Layer objects for accessing the raw layer metadata.
filetree.FileTree: a tree representing a filesystem. All nodes represent real paths (paths with no link resolution anywhere in the path) and are absolute paths (start with / and contain no relative path elements [e.g. ../ or ./]). This represents the filesystem structure and each node has a reference to the file metadata for that path.
file.Reference: a unique file in the filesystem, identified by an absolute, real path as well as an integer ID (file.IDs). These are used to reference concrete nodes in the filetree.FileTree and image.FileCatalog objects.
file.Index: stores all known file.Reference and file.Metadata. Entries are indexed in a variety of ways to provide fast access to references and metadata without needing to crawl the tree. This is especially useful for speeding up globbing.
image.FileCatalog: an image-aware extension of file.Index that additionally relates image.Layers to file.IDs and provides a content API for any files contained within the image (regardless of which layer or squashed representation it exists in).
Searching for files
Searching for files is exposed to users in three ways:
search by file path
search by file glob
search by file content MIME type
Searching itself is performed two different ways:
search the image.FileCatalog on the image by a heuristic
search the filetree.FileTree directly
The “best way” to search is automatically determined in the filetree.searchContext object, exposed on image.Image and image.Layer objects as a filetree.Searcher for general use.
File trees
The filetree.FileTree object represents a filesystem and consists of filenode.Node objects. The tree itself leverages tree.Tree as a generic datastructure. What filetree.FileTree adds is the concept of file types, the semantics of each type, the ability to resolve links based on a given strategy, merging of trees with the same semantics of a union filesystem (e.g. whiteout files), and the ability to search for files via direct paths or globs.
The fs.FS abstraction has been implemented on filetree.FileTree to allow for easy integration with the standard library as well as to interop with the doublestar library to facilitate globbing. Using the fs.FS abstraction for filetree operations is faster than OS interactions with the filesystem directly but relatively slower than the indexes provided by image.FileCatalog and file.Index.
filetree.FileTree objects can be created with a corresponding file.Index object by leveraging the filetree.Builder object, which aids in the indexing of files.
6 - Reference
Reference for Anchore OSS Tools
6.1 - Grype Command Line Reference
Note
This documentation was generated from Grype version 0.100.0.
A vulnerability scanner for container images, filesystems, and SBOMs.
Supports the following image sources:
grype yourrepo/yourimage:tag defaults to using images from a Docker daemon
grype path/to/yourproject a Docker tar, OCI tar, OCI directory, SIF container, or generic filesystem directory
You can also explicitly specify the scheme to use:
grype podman:yourrepo/yourimage:tag explicitly use the Podman daemon
grype docker:yourrepo/yourimage:tag explicitly use the Docker daemon
grype docker-archive:path/to/yourimage.tar use a tarball from disk for archives created from "docker save"
grype oci-archive:path/to/yourimage.tar use a tarball from disk for OCI archives (from Podman or otherwise)
grype oci-dir:path/to/yourimage read directly from a path on disk for OCI layout directories (from Skopeo or otherwise)
grype singularity:path/to/yourimage.sif read directly from a Singularity Image Format (SIF) container on disk
grype dir:path/to/yourproject read directly from a path on disk (any directory)
grype file:path/to/yourfile read directly from a file on disk
grype sbom:path/to/syft.json read Syft JSON from path on disk
grype registry:yourrepo/yourimage:tag pull image directly from a registry (no container runtime required)
grype purl:path/to/purl/file read a newline separated file of package URLs from a path on disk
grype PURL read a single package PURL directly (e.g. pkg:apk/openssl@3.2.1?distro=alpine-3.20.3)
grype CPE read a single CPE directly (e.g. cpe:2.3:a:openssl:openssl:3.0.14:*:*:*:*:*)
You can also pipe in Syft JSON directly:
syft yourimage:tag -o json | grype
Usage:
grype [IMAGE] [flags]
grype [command]
Available Commands:
completion Generate a shell completion for Grype (listing local docker images)
config show the grype configuration
db vulnerability database operations
explain Ask grype to explain a set of findings
help Help about any command
version show version information
Flags:
--add-cpes-if-none generate CPEs for packages with no CPE data
--by-cve orient results by CVE instead of the original vulnerability ID when possible
-c, --config stringArray grype configuration file(s) to use
--distro string distro to match against in the format: <distro>:<version>
--exclude stringArray exclude paths from being scanned using a glob expression
-f, --fail-on string set the return code to 1 if a vulnerability is found with a severity >= the given severity, options=[negligible low medium high critical]
--file string file to write the default report output to (default is STDOUT)
-h, --help help for grype
--ignore-states string ignore matches for vulnerabilities with specified comma separated fix states, options=[fixed not-fixed unknown wont-fix]
--name string set the name of the target being analyzed
--only-fixed ignore matches for vulnerabilities that are not fixed
--only-notfixed ignore matches for vulnerabilities that are fixed
-o, --output stringArray report output formatter, formats=[json table cyclonedx cyclonedx-json sarif template], deprecated formats=[embedded-cyclonedx-vex-json embedded-cyclonedx-vex-xml]
--platform string an optional platform specifier for container image sources (e.g. 'linux/arm64', 'linux/arm64/v8', 'arm64', 'linux')
--profile stringArray configuration profiles to use
-q, --quiet suppress all logging output
-s, --scope string selection of layers to analyze, options=[squashed all-layers deep-squashed] (default "squashed")
--show-suppressed show suppressed/ignored vulnerabilities in the output (only supported with table output format)
--sort-by string sort the match results with the given strategy, options=[package severity epss risk kev vulnerability] (default "risk")
-t, --template string specify the path to a Go template file (requires 'template' output to be selected)
-v, --verbose count increase verbosity (-v = info, -vv = debug)
--version version for grype
--vex stringArray a list of VEX documents to consider when producing scanning results
Use "grype [command] --help"for more information about a command.
grype config
Show the grype configuration.
Usage:
grype config [flags]
grype config [command]
Available Commands:
locations shows all locations and the order in which grype will look for a configuration file
Flags:
-h, --help help for config
--load load and validate the grype configuration
grype db check
Check to see if there is a database update available.
Usage:
grype db check [flags]
Flags:
-h, --help help for check
-o, --output string format to display results (available=[text, json]) (default "text")
grype db delete
Delete the vulnerability database.
Usage:
grype db delete [flags]
Flags:
-h, --help help for delete
grype db import
Import a vulnerability database archive from a local FILE or URL.
DB archives can be obtained from “https://grype.anchore.io/databases” (or running db list). If the URL has a checksum query parameter with a fully qualified digest (e.g. ‘sha256:abc728…’) then the archive/DB will be verified against this value.
Usage:
grype db import FILE | URL [flags]
Flags:
-h, --help help for import
grype db list
List all DBs available according to the listing URL.
Usage:
grype db list [flags]
Flags:
-h, --help help for list
-o, --output string format to display results (available=[text, raw, json]) (default "text")
grype db providers
List vulnerability providers that are in the database.
Usage:
grype db providers [flags]
Flags:
-h, --help help for providers
-o, --output string format to display results (available=[table, json]) (default "table")
grype db search
Search the DB for vulnerabilities or affected packages.
Usage:
grype db search [flags]
grype db search [command]
Examples:
Search for affected packages by vulnerability ID:
$ grype db search --vuln ELSA-2023-12205
Search for affected packages by package name:
$ grype db search --pkg log4j
Search for affected packages by package name, filtering down to a specific vulnerability:
$ grype db search --pkg log4j --vuln CVE-2021-44228
Search for affected packages by PURL (note: version is not considered):
$ grype db search --pkg 'pkg:rpm/redhat/openssl' # or: '--ecosystem rpm --pkg openssl'
Search for affected packages by CPE (note: version/update is not considered):
$ grype db search --pkg 'cpe:2.3:a:jetty:jetty_http_server:*:*:*:*:*:*:*:*'
$ grype db search --pkg 'cpe:/a:jetty:jetty_http_server'
Available Commands:
vuln Search for vulnerabilities within the DB (supports DB schema v6+ only)
Flags:
--broad-cpe-matching allow for specific package CPE attributes to match with '*' values on the vulnerability
--distro stringArray refine to results with the given operating system (format: 'name', 'name@version', 'name@maj.min', 'name@codename')
--ecosystem string ecosystem of the package to search within
-h, --help help for search
--limit int limit the number of results returned, use 0 for no limit (default 5000)
--modified-after string only show vulnerabilities originally published or modified since the given date (format: YYYY-MM-DD)
-o, --output string format to display results (available=[table, json]) (default "table")
--pkg stringArray package name/CPE/PURL to search for
--provider stringArray only show vulnerabilities from the given provider
--published-after string only show vulnerabilities originally published after the given date (format: YYYY-MM-DD)
--vuln stringArray only show results for the given vulnerability ID
grype db status
Display database status and metadata.
Usage:
grype db status [flags]
Flags:
-h, --help help for status
-o, --output string format to display results (available=[text, json]) (default "text")
grype db update
Download and install the latest vulnerability database.
Usage:
grype db update [flags]
Flags:
-h, --help help for update
grype explain
Ask grype to explain a set of findings.
Usage:
grype explain --id [VULNERABILITY ID] [flags]
Flags:
-h, --help help for explain
--id stringArray CVE IDs to explain
grype version
Show version information.
Usage:
grype version [flags]
Flags:
-h, --help help for version
-o, --output string the format to show the results (allowable: [text json]) (default "text")
6.2 - Syft Command Line Reference
Note
This documentation was generated from Syft version 1.33.0.
Generate a packaged-based Software Bill Of Materials (SBOM) from container images and filesystems
Usage:
syft [SOURCE] [flags]
syft [command]
Examples:
syft scan alpine:latest a summary of discovered packages
syft scan alpine:latest -o json show all possible cataloging details
syft scan alpine:latest -o cyclonedx show a CycloneDX formatted SBOM
syft scan alpine:latest -o cyclonedx-json show a CycloneDX JSON formatted SBOM
syft scan alpine:latest -o spdx show a SPDX 2.3 Tag-Value formatted SBOM
syft scan alpine:latest -o spdx@2.2 show a SPDX 2.2 Tag-Value formatted SBOM
syft scan alpine:latest -o spdx-json show a SPDX 2.3 JSON formatted SBOM
syft scan alpine:latest -o spdx-json@2.2 show a SPDX 2.2 JSON formatted SBOM
syft scan alpine:latest -vv show verbose debug information
syft scan alpine:latest -o template -t my_format.tmpl show a SBOM formatted according to given template file
Supports the following image sources:
syft scan yourrepo/yourimage:tag defaults to using images from a Docker daemon. If Docker is not present, the image is pulled directly from the registry.
syft scan path/to/a/file/or/dir a Docker tar, OCI tar, OCI directory, SIF container, or generic filesystem directory
You can also explicitly specify the scheme to use:
syft scan docker:yourrepo/yourimage:tag explicitly use the Docker daemon
syft scan podman:yourrepo/yourimage:tag explicitly use the Podman daemon
syft scan registry:yourrepo/yourimage:tag pull image directly from a registry (no container runtime required)
syft scan docker-archive:path/to/yourimage.tar use a tarball from disk for archives created from "docker save"
syft scan oci-archive:path/to/yourimage.tar use a tarball from disk for OCI archives (from Skopeo or otherwise)
syft scan oci-dir:path/to/yourimage read directly from a path on disk for OCI layout directories (from Skopeo or otherwise)
syft scan singularity:path/to/yourimage.sif read directly from a Singularity Image Format (SIF) container on disk
syft scan dir:path/to/yourproject read directly from a path on disk (any directory)
syft scan file:path/to/yourproject/file read directly from a path on disk (any single file)
Available Commands:
attest Generate an SBOM as an attestation for the given [SOURCE] container image
cataloger Show available catalogers and configuration
completion Generate the autocompletion script for the specified shell
config show the syft configuration
convert Convert between SBOM formats
help Help about any command
login Log in to a registry
scan Generate an SBOM
version show version information
Flags:
--base-path string base directory for scanning, no links will be followed above this directory, and all paths will be reported relative to this directory
-c, --config stringArray syft configuration file(s) to use
--enrich stringArray enable package data enrichment from local and online sources (options: all, golang, java, javascript)
--exclude stringArray exclude paths from being scanned using a glob expression
--file string file to write the default report output to (default is STDOUT) (DEPRECATED: use: --output FORMAT=PATH)
--from stringArray specify the source behavior to use (e.g. docker, registry, oci-dir, ...)
-h, --help help for syft
-o, --output stringArray report output format (<format>=<file> to output to a file), formats=[cyclonedx-json cyclonedx-xml github-json purls spdx-json spdx-tag-value syft-json syft-table syft-text template] (default [syft-table])
--override-default-catalogers stringArray set the base set of catalogers to use (defaults to 'image' or 'directory' depending on the scan source)
--parallelism int number of cataloger workers to run in parallel
--platform string an optional platform specifier for container image sources (e.g. 'linux/arm64', 'linux/arm64/v8', 'arm64', 'linux')
--profile stringArray configuration profiles to use
-q, --quiet suppress all logging output
-s, --scope string selection of layers to catalog, options=[squashed all-layers deep-squashed] (default "squashed")
--select-catalogers stringArray add, remove, and filter the catalogers to be used
--source-name string set the name of the target being analyzed
--source-supplier string the organization that supplied the component, which often may be the manufacturer, distributor, or repackager
--source-version string set the version of the target being analyzed
-t, --template string specify the path to a Go template file
-v, --verbose count increase verbosity (-v = info, -vv = debug)
--version version for syft
Use "syft [command] --help"for more information about a command.
syft attest
Generate a packaged-based Software Bill Of Materials (SBOM) from a container image as the predicate of an in-toto attestation that will be uploaded to the image registry.
Usage:
syft attest --output [FORMAT] <IMAGE> [flags]
Examples:
syft attest --output [FORMAT] alpine:latest defaults to using images from a Docker daemon. If Docker is not present, the image is pulled directly from the registry
You can also explicitly specify the scheme to use:
syft attest docker:yourrepo/yourimage:tag explicitly use the Docker daemon
syft attest podman:yourrepo/yourimage:tag explicitly use the Podman daemon
syft attest registry:yourrepo/yourimage:tag pull image directly from a registry (no container runtime required)
syft attest docker-archive:path/to/yourimage.tar use a tarball from disk for archives created from "docker save"
syft attest oci-archive:path/to/yourimage.tar use a tarball from disk for OCI archives (from Skopeo or otherwise)
syft attest oci-dir:path/to/yourimage read directly from a path on disk for OCI layout directories (from Skopeo or otherwise)
syft attest singularity:path/to/yourimage.sif read directly from a Singularity Image Format (SIF) container on disk
Flags:
--base-path string base directory for scanning, no links will be followed above this directory, and all paths will be reported relative to this directory
--enrich stringArray enable package data enrichment from local and online sources (options: all, golang, java, javascript)
--exclude stringArray exclude paths from being scanned using a glob expression
--from stringArray specify the source behavior to use (e.g. docker, registry, oci-dir, ...)
-h, --help help for attest
-k, --key string the key to use for the attestation
-o, --output stringArray report output format (<format>=<file> to output to a file), formats=[cyclonedx-json cyclonedx-xml github-json purls spdx-json spdx-tag-value syft-json syft-table syft-text template] (default [syft-json])
--override-default-catalogers stringArray set the base set of catalogers to use (defaults to 'image' or 'directory' depending on the scan source)
--parallelism int number of cataloger workers to run in parallel
--platform string an optional platform specifier for container image sources (e.g. 'linux/arm64', 'linux/arm64/v8', 'arm64', 'linux')
-s, --scope string selection of layers to catalog, options=[squashed all-layers deep-squashed] (default "squashed")
--select-catalogers stringArray add, remove, and filter the catalogers to be used
--source-name string set the name of the target being analyzed
--source-supplier string the organization that supplied the component, which often may be the manufacturer, distributor, or repackager
--source-version string set the version of the target being analyzed
syft cataloger list
List available catalogers.
Usage:
syft cataloger list [OPTIONS] [flags]
Flags:
-h, --help help for list
-o, --output string format to output the cataloger list (available: table, json)
--override-default-catalogers stringArray override the default catalogers with an expression (default [all])
--select-catalogers stringArray select catalogers with an expression
-s, --show-hidden show catalogers that have been de-selected
syft config
Show the syft configuration.
Usage:
syft config [flags]
syft config [command]
Available Commands:
locations shows all locations and the order in which syft will look for a configuration file
Flags:
-h, --help help for config
--load load and validate the syft configuration
syft convert
Convert between SBOM formats.
Usage:
syft convert [SOURCE-SBOM] -o [FORMAT] [flags]
Examples:
syft convert img.syft.json -o spdx-json convert a syft SBOM to spdx-json, output goes to stdout
syft convert img.syft.json -o cyclonedx-json=img.cdx.json convert a syft SBOM to CycloneDX, output is written to the file "img.cdx.json"
syft convert - -o spdx-json convert an SBOM from STDIN to spdx-json
Flags:
--file string file to write the default report output to (default is STDOUT) (DEPRECATED: use: --output FORMAT=PATH)
-h, --help help for convert
-o, --output stringArray report output format (<format>=<file> to output to a file), formats=[cyclonedx-json cyclonedx-xml github-json purls spdx-json spdx-tag-value syft-json syft-table syft-text template] (default [syft-table])
-t, --template string specify the path to a Go template file
syft login
Log in to a registry.
Usage:
syft login [OPTIONS] [SERVER] [flags]
Examples:
# Log in to reg.example.com
syft login reg.example.com -u AzureDiamond -p hunter2
Flags:
-h, --help help for login
-p, --password string Password
--password-stdin Take the password from stdin
-u, --username string Username
syft scan
Generate a packaged-based Software Bill Of Materials (SBOM) from container images and filesystems.
Usage:
syft scan [SOURCE] [flags]
Examples:
syft scan alpine:latest a summary of discovered packages
syft scan alpine:latest -o json show all possible cataloging details
syft scan alpine:latest -o cyclonedx show a CycloneDX formatted SBOM
syft scan alpine:latest -o cyclonedx-json show a CycloneDX JSON formatted SBOM
syft scan alpine:latest -o spdx show a SPDX 2.3 Tag-Value formatted SBOM
syft scan alpine:latest -o spdx@2.2 show a SPDX 2.2 Tag-Value formatted SBOM
syft scan alpine:latest -o spdx-json show a SPDX 2.3 JSON formatted SBOM
syft scan alpine:latest -o spdx-json@2.2 show a SPDX 2.2 JSON formatted SBOM
syft scan alpine:latest -vv show verbose debug information
syft scan alpine:latest -o template -t my_format.tmpl show a SBOM formatted according to given template file
Supports the following image sources:
syft scan yourrepo/yourimage:tag defaults to using images from a Docker daemon. If Docker is not present, the image is pulled directly from the registry.
syft scan path/to/a/file/or/dir a Docker tar, OCI tar, OCI directory, SIF container, or generic filesystem directory
You can also explicitly specify the scheme to use:
syft scan docker:yourrepo/yourimage:tag explicitly use the Docker daemon
syft scan podman:yourrepo/yourimage:tag explicitly use the Podman daemon
syft scan registry:yourrepo/yourimage:tag pull image directly from a registry (no container runtime required)
syft scan docker-archive:path/to/yourimage.tar use a tarball from disk for archives created from "docker save"
syft scan oci-archive:path/to/yourimage.tar use a tarball from disk for OCI archives (from Skopeo or otherwise)
syft scan oci-dir:path/to/yourimage read directly from a path on disk for OCI layout directories (from Skopeo or otherwise)
syft scan singularity:path/to/yourimage.sif read directly from a Singularity Image Format (SIF) container on disk
syft scan dir:path/to/yourproject read directly from a path on disk (any directory)
syft scan file:path/to/yourproject/file read directly from a path on disk (any single file)
Flags:
--base-path string base directory for scanning, no links will be followed above this directory, and all paths will be reported relative to this directory
--enrich stringArray enable package data enrichment from local and online sources (options: all, golang, java, javascript)
--exclude stringArray exclude paths from being scanned using a glob expression
--file string file to write the default report output to (default is STDOUT) (DEPRECATED: use: --output FORMAT=PATH)
--from stringArray specify the source behavior to use (e.g. docker, registry, oci-dir, ...)
-h, --help help for scan
-o, --output stringArray report output format (<format>=<file> to output to a file), formats=[cyclonedx-json cyclonedx-xml github-json purls spdx-json spdx-tag-value syft-json syft-table syft-text template] (default [syft-table])
--override-default-catalogers stringArray set the base set of catalogers to use (defaults to 'image' or 'directory' depending on the scan source)
--parallelism int number of cataloger workers to run in parallel
--platform string an optional platform specifier for container image sources (e.g. 'linux/arm64', 'linux/arm64/v8', 'arm64', 'linux')
-s, --scope string selection of layers to catalog, options=[squashed all-layers deep-squashed] (default "squashed")
--select-catalogers stringArray add, remove, and filter the catalogers to be used
--source-name string set the name of the target being analyzed
--source-supplier string the organization that supplied the component, which often may be the manufacturer, distributor, or repackager
--source-version string set the version of the target being analyzed
-t, --template string specify the path to a Go template file
syft version
Show version information.
Usage:
syft version [flags]
Flags:
-h, --help help for version
-o, --output string the format to show the results (allowable: [text json]) (default "text")
6.3 - Grype Default Configuration
Note
This documentation was generated from Grype version 0.100.0.
Grype searches for configuration files in the following locations, in order:
./.grype.yaml - current working directory
./.grype/config.yaml - app subdirectory in current working directory
The configuration file can use either .yaml or .yml extensions. The first configuration file found will be used.
log:
  # suppress all logging output (env: GRYPE_LOG_QUIET)
  quiet: false

  # explicitly set the logging level (available: [error warn info debug trace]) (env: GRYPE_LOG_LEVEL)
  level: "warn"

  # file path to write logs to (env: GRYPE_LOG_FILE)
  file: ""

dev:
  # capture resource profiling data (available: [cpu, mem]) (env: GRYPE_DEV_PROFILE)
  profile: ""

  db:
    # (env: GRYPE_DEV_DB_DEBUG)
    debug: false

# the output format of the vulnerability report (options: table, template, json, cyclonedx)
# when using template as the output type, you must also provide a value for 'output-template-file' (env: GRYPE_OUTPUT)
output: []

# if using template output, you must provide a path to a Go template file
# see https://github.com/anchore/grype#using-templates for more information on template output
# the default path to the template file is the current working directory
# output-template-file: .grype/html.tmpl
#
# write output report to a file (default is to write to stdout) (env: GRYPE_FILE)
file: ""

# pretty-print output (env: GRYPE_PRETTY)
pretty: false

# distro to match against in the format: <distro>:<version> (env: GRYPE_DISTRO)
distro: ""

# generate CPEs for packages with no CPE data (env: GRYPE_ADD_CPES_IF_NONE)
add-cpes-if-none: false

# specify the path to a Go template file (requires 'template' output to be selected) (env: GRYPE_OUTPUT_TEMPLATE_FILE)
output-template-file: ""

# enable/disable checking for application updates on startup (env: GRYPE_CHECK_FOR_APP_UPDATE)
check-for-app-update: true

# ignore matches for vulnerabilities that are not fixed (env: GRYPE_ONLY_FIXED)
only-fixed: false

# ignore matches for vulnerabilities that are fixed (env: GRYPE_ONLY_NOTFIXED)
only-notfixed: false

# ignore matches for vulnerabilities with specified comma separated fix states, options=[fixed not-fixed unknown wont-fix] (env: GRYPE_IGNORE_WONTFIX)
ignore-wontfix: ""

# an optional platform specifier for container image sources (e.g. 'linux/arm64', 'linux/arm64/v8', 'arm64', 'linux') (env: GRYPE_PLATFORM)
platform: ""

search:
  # selection of layers to analyze, options=[squashed all-layers deep-squashed] (env: GRYPE_SEARCH_SCOPE)
  scope: "squashed"

  # search within archives that do not contain a file index to search against (tar, tar.gz, tar.bz2, etc)
  # note: enabling this may result in a performance impact since all discovered compressed tars will be decompressed
  # note: for now this only applies to the java package cataloger (env: GRYPE_SEARCH_UNINDEXED_ARCHIVES)
  unindexed-archives: false

  # search within archives that do contain a file index to search against (zip)
  # note: for now this only applies to the java package cataloger (env: GRYPE_SEARCH_INDEXED_ARCHIVES)
  indexed-archives: true

# A list of vulnerability ignore rules, one or more property may be specified and all matching vulnerabilities will be ignored.
# This is the full set of supported rule fields:
#   - vulnerability: CVE-2008-4318
#     fix-state: unknown
#     package:
#       name: libcurl
#       version: 1.5.1
#       type: npm
#       location: "/usr/local/lib/node_modules/**"
#
# VEX fields apply when Grype reads vex data:
#   - vex-status: not_affected
#     vex-justification: vulnerable_code_not_present
ignore: []

# a list of globs to exclude from scanning, for example:
#   - '/etc/**'
#   - './out/**/*.json'
# same as --exclude (env: GRYPE_EXCLUDE)
exclude: []

external-sources:
  # enable Grype searching network source for additional information (env: GRYPE_EXTERNAL_SOURCES_ENABLE)
  enable: false

  maven:
    # search for Maven artifacts by SHA1 (env: GRYPE_EXTERNAL_SOURCES_MAVEN_SEARCH_MAVEN_UPSTREAM)
    search-maven-upstream: true

    # base URL of the Maven repository to search (env: GRYPE_EXTERNAL_SOURCES_MAVEN_BASE_URL)
    base-url: "https://search.maven.org/solrsearch/select"

    # (env: GRYPE_EXTERNAL_SOURCES_MAVEN_RATE_LIMIT)
    rate-limit: 300ms

match:
  java:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_JAVA_USING_CPES)
    using-cpes: false

  jvm:
    # (env: GRYPE_MATCH_JVM_USING_CPES)
    using-cpes: true

  dotnet:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_DOTNET_USING_CPES)
    using-cpes: false

  golang:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_GOLANG_USING_CPES)
    using-cpes: false

    # use CPE matching to find vulnerabilities for the Go standard library (env: GRYPE_MATCH_GOLANG_ALWAYS_USE_CPE_FOR_STDLIB)
    always-use-cpe-for-stdlib: true

    # allow comparison between main module pseudo-versions (e.g. v0.0.0-20240413-2b432cf643...) (env: GRYPE_MATCH_GOLANG_ALLOW_MAIN_MODULE_PSEUDO_VERSION_COMPARISON)
    allow-main-module-pseudo-version-comparison: false

  javascript:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_JAVASCRIPT_USING_CPES)
    using-cpes: false

  python:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_PYTHON_USING_CPES)
    using-cpes: false

  ruby:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_RUBY_USING_CPES)
    using-cpes: false

  rust:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_RUST_USING_CPES)
    using-cpes: false

  stock:
    # use CPE matching to find vulnerabilities (env: GRYPE_MATCH_STOCK_USING_CPES)
    using-cpes: true

# upon scanning, if a severity is found at or above the given severity then the return code will be 1
# default is unset which will skip this validation (options: negligible, low, medium, high, critical) (env: GRYPE_FAIL_ON_SEVERITY)
fail-on-severity: ""

registry:
  # skip TLS verification when communicating with the registry (env: GRYPE_REGISTRY_INSECURE_SKIP_TLS_VERIFY)
  insecure-skip-tls-verify: false

  # use http instead of https when connecting to the registry (env: GRYPE_REGISTRY_INSECURE_USE_HTTP)
  insecure-use-http: false

  # Authentication credentials for specific registries. Each entry describes authentication for a specific authority:
  # - authority: the registry authority URL the URL to the registry (e.g. "docker.io", "localhost:5000", etc.) (env: SYFT_REGISTRY_AUTH_AUTHORITY)
  #   username: a username if using basic credentials (env: SYFT_REGISTRY_AUTH_USERNAME)
  #   password: a corresponding password (env: SYFT_REGISTRY_AUTH_PASSWORD)
  #   token: a token if using token-based authentication, mutually exclusive with username/password (env: SYFT_REGISTRY_AUTH_TOKEN)
  #   tls-cert: filepath to the client certificate used for TLS authentication to the registry (env: SYFT_REGISTRY_AUTH_TLS_CERT)
  #   tls-key: filepath to the client key used for TLS authentication to the registry (env: SYFT_REGISTRY_AUTH_TLS_KEY)
  auth: []

  # filepath to a CA certificate (or directory containing *.crt, *.cert, *.pem) used to generate the client certificate (env: GRYPE_REGISTRY_CA_CERT)
  ca-cert: ""

# show suppressed/ignored vulnerabilities in the output (only supported with table output format) (env: GRYPE_SHOW_SUPPRESSED)
show-suppressed: false

# orient results by CVE instead of the original vulnerability ID when possible (env: GRYPE_BY_CVE)
by-cve: false

# sort the match results with the given strategy, options=[package severity epss risk kev vulnerability] (env: GRYPE_SORT_BY)
sort-by: "risk"

# same as --name; set the name of the target being analyzed (env: GRYPE_NAME)
name: ""

# allows users to specify which image source should be used to generate the sbom
# valid values are: registry, docker, podman (env: GRYPE_DEFAULT_IMAGE_PULL_SOURCE)
default-image-pull-source: ""

# a list of VEX documents to consider when producing scanning results (env: GRYPE_VEX_DOCUMENTS)
vex-documents: []

# VEX statuses to consider as ignored rules (env: GRYPE_VEX_ADD)
vex-add: []

# match kernel-header packages with upstream kernel as kernel vulnerabilities (env: GRYPE_MATCH_UPSTREAM_KERNEL_HEADERS)
match-upstream-kernel-headers: false

fix-channel:
  redhat-eus:
    # whether fixes from this channel should be considered, options are "never", "always", or "auto" (conditionally applied based on SBOM data) (env: GRYPE_FIX_CHANNEL_REDHAT_EUS_APPLY)
    apply: "auto"

    # (env: GRYPE_FIX_CHANNEL_REDHAT_EUS_VERSIONS)
    versions: ">= 8.0"

# (env: GRYPE_TIMESTAMP)
timestamp: true

db:
  # location to write the vulnerability database cache (env: GRYPE_DB_CACHE_DIR)
  cache-dir: "~/.cache/grype/db"

  # URL of the vulnerability database (env: GRYPE_DB_UPDATE_URL)
  update-url: "https://grype.anchore.io/databases"

  # certificate to trust download the database and listing file (env: GRYPE_DB_CA_CERT)
  ca-cert: ""

  # check for database updates on execution (env: GRYPE_DB_AUTO_UPDATE)
  auto-update: true

  # validate the database matches the known hash each execution (env: GRYPE_DB_VALIDATE_BY_HASH_ON_START)
  validate-by-hash-on-start: true

  # ensure db build is no older than the max-allowed-built-age (env: GRYPE_DB_VALIDATE_AGE)
  validate-age: true

  # Max allowed age for vulnerability database,
  # age being the time since it was built
  # Default max age is 120h (or five days) (env: GRYPE_DB_MAX_ALLOWED_BUILT_AGE)
  max-allowed-built-age: 120h0m0s

  # fail the scan if unable to check for database updates (env: GRYPE_DB_REQUIRE_UPDATE_CHECK)
  require-update-check: false

  # Timeout for downloading GRYPE_DB_UPDATE_URL to see if the database needs to be downloaded
  # This file is ~156KiB as of 2024-04-17 so the download should be quick; adjust as needed (env: GRYPE_DB_UPDATE_AVAILABLE_TIMEOUT)
  update-available-timeout: 30s

  # Timeout for downloading actual vulnerability DB
  # The DB is ~156MB as of 2024-04-17 so slower connections may exceed the default timeout; adjust as needed (env: GRYPE_DB_UPDATE_DOWNLOAD_TIMEOUT)
  update-download-timeout: 5m0s

  # Maximum frequency to check for vulnerability database updates (env: GRYPE_DB_MAX_UPDATE_CHECK_FREQUENCY)
  max-update-check-frequency: 2h0m0s

exp:
6.4 - Syft Default Configuration
Note
This documentation was generated from Syft version 1.33.0.
Syft searches for configuration files in the following locations, in order:
./.syft.yaml - current working directory
./.syft/config.yaml - app subdirectory in current working directory
The configuration file can use either .yaml or .yml extensions. The first configuration file found will be used.
log:
  # suppress all logging output (env: SYFT_LOG_QUIET)
  quiet: false

  # increase verbosity (-v = info, -vv = debug) (env: SYFT_LOG_VERBOSITY)
  verbosity: 0

  # explicitly set the logging level (available: [error warn info debug trace]) (env: SYFT_LOG_LEVEL)
  level: "warn"

  # file path to write logs to (env: SYFT_LOG_FILE)
  file: ""

dev:
  # capture resource profiling data (available: [cpu, mem]) (env: SYFT_DEV_PROFILE)
  profile: ""

# the configuration file(s) used to load application configuration (env: SYFT_CONFIG)
config: ""

# the output format(s) of the SBOM report (options: syft-table, syft-text, syft-json, spdx-json, ...)
# to specify multiple output files in differing formats, use a list:
# output:
#   - "syft-json=<syft-json-output-file>"
#   - "spdx-json=<spdx-json-output-file>" (env: SYFT_OUTPUT)
output:
  - "syft-table"

# file to write the default report output to (default is STDOUT) (env: SYFT_LEGACYFILE)
legacyFile: ""

format:
  # default value for all formats that support the "pretty" option (default is unset) (env: SYFT_FORMAT_PRETTY)
  pretty:

  template:
    # path to the template file to use when rendering the output with the template output format.
    # Note that all template paths are based on the current syft-json schema (env: SYFT_FORMAT_TEMPLATE_PATH)
    path: ""

    # if true, uses the go structs for the syft-json format for templating.
    # if false, uses the syft-json output for templating (which follows the syft JSON schema exactly).
    #
    # Note: long term support for this option is not guaranteed (it may change or break at any time) (env: SYFT_FORMAT_TEMPLATE_LEGACY)
    legacy: false

  json:
    # transform any syft-json output to conform to an approximation of the v11.0.1 schema. This includes:
    # - using the package metadata type names from before v12 of the JSON schema (changed in https://github.com/anchore/syft/pull/1983)
    #
    # Note: this will still include package types and fields that were added at or after json schema v12. This means
    # that output might not strictly be json schema v11 compliant, however, for consumers that require time to port
    # over to the final syft 1.0 json output this option can be used to ease the transition.
    #
    # Note: long term support for this option is not guaranteed (it may change or break at any time) (env: SYFT_FORMAT_JSON_LEGACY)
    legacy: false

    # include space indentation and newlines
    # note: inherits default value from 'format.pretty' or 'false' if parent is unset (env: SYFT_FORMAT_JSON_PRETTY)
    pretty:

  spdx-json:
    # include space indentation and newlines
    # note: inherits default value from 'format.pretty' or 'false' if parent is unset (env: SYFT_FORMAT_SPDX_JSON_PRETTY)
    pretty:

  cyclonedx-json:
    # include space indentation and newlines
    # note: inherits default value from 'format.pretty' or 'false' if parent is unset (env: SYFT_FORMAT_CYCLONEDX_JSON_PRETTY)
    pretty:

  cyclonedx-xml:
    # include space indentation and newlines
    # note: inherits default value from 'format.pretty' or 'false' if parent is unset (env: SYFT_FORMAT_CYCLONEDX_XML_PRETTY)
    pretty:

# whether to check for an application update on start up or not (env: SYFT_CHECK_FOR_APP_UPDATE)
check-for-app-update: true

# enable one or more package catalogers (env: SYFT_CATALOGERS)
catalogers: []

# set the base set of catalogers to use (defaults to 'image' or 'directory' depending on the scan source) (env: SYFT_DEFAULT_CATALOGERS)
default-catalogers: []

# add, remove, and filter the catalogers to be used (env: SYFT_SELECT_CATALOGERS)
select-catalogers: []

package:
  # search within archives that do not contain a file index to search against (tar, tar.gz, tar.bz2, etc)
  # note: enabling this may result in a performance impact since all discovered compressed tars will be decompressed
  # note: for now this only applies to the java package cataloger (env: SYFT_PACKAGE_SEARCH_UNINDEXED_ARCHIVES)
  search-unindexed-archives: false

  # search within archives that do contain a file index to search against (zip)
  # note: for now this only applies to the java package cataloger (env: SYFT_PACKAGE_SEARCH_INDEXED_ARCHIVES)
  search-indexed-archives: true

  # allows users to exclude synthetic binary packages from the sbom
  # these packages are removed if an overlap with a non-synthetic package is found (env: SYFT_PACKAGE_EXCLUDE_BINARY_OVERLAP_BY_OWNERSHIP)
  exclude-binary-overlap-by-ownership: true

license:
  # include the content of licenses in the SBOM for a given syft scan; valid values are: [all unknown none] (env: SYFT_LICENSE_CONTENT)
  content: "none"

  # adjust the percent as a fraction of the total text, in normalized words, that
  # matches any valid license for the given inputs, expressed as a percentage across all of the licenses matched. (env: SYFT_LICENSE_COVERAGE)
  coverage: 75

file:
  metadata:
    # select which files should be captured by the file-metadata cataloger and included in the SBOM.
    # Options include:
    #  - "all": capture all files from the search space
    #  - "owned-by-package": capture only files owned by packages
    #  - "none", "": do not capture any files (env: SYFT_FILE_METADATA_SELECTION)
    selection: "owned-by-package"

    # the file digest algorithms to use when cataloging files (options: "md5", "sha1", "sha224", "sha256", "sha384", "sha512") (env: SYFT_FILE_METADATA_DIGESTS)
    digests:
      - "sha1"
      - "sha256"

  content:
    # skip searching a file entirely if it is above the given size (default = 1MB; unit = bytes) (env: SYFT_FILE_CONTENT_SKIP_FILES_ABOVE_SIZE)
    skip-files-above-size: 256000

    # file globs for the cataloger to match on (env: SYFT_FILE_CONTENT_GLOBS)
    globs: []

  executable:
    # file globs for the cataloger to match on (env: SYFT_FILE_EXECUTABLE_GLOBS)
    globs: []

# selection of layers to catalog, options=[squashed all-layers deep-squashed] (env: SYFT_SCOPE)
scope: "squashed"

# number of cataloger workers to run in parallel
# by default, when set to 0: this will be based on runtime.NumCPU * 4, if set to less than 0 it will be unbounded (env: SYFT_PARALLELISM)
parallelism: 0

relationships:
  # include package-to-file relationships that indicate which files are owned by which packages (env: SYFT_RELATIONSHIPS_PACKAGE_FILE_OWNERSHIP)
  package-file-ownership: true

  # include package-to-package relationships that indicate one package is owned by another due to files claimed to be owned by one package are also evidence of another package's existence (env: SYFT_RELATIONSHIPS_PACKAGE_FILE_OWNERSHIP_OVERLAP)
  package-file-ownership-overlap: true

compliance:
  # action to take when a package is missing a name (env: SYFT_COMPLIANCE_MISSING_NAME)
  missing-name: "drop"

  # action to take when a package is missing a version (env: SYFT_COMPLIANCE_MISSING_VERSION)
  missing-version: "stub"

# Enable data enrichment operations, which can utilize services such as Maven Central and NPM.
# By default all enrichment is disabled, use: all to enable everything.
# Available options are: all, golang, java, javascript (env: SYFT_ENRICH)
enrich: []

dotnet:
  # only keep dep.json packages which an executable on disk is found. The package is also included if a DLL is found for any child package, even if the package itself does not have a DLL. (env: SYFT_DOTNET_DEP_PACKAGES_MUST_HAVE_DLL)
  dep-packages-must-have-dll: false

  # only keep dep.json packages which have a runtime/resource DLL claimed in the deps.json targets section (but not necessarily found on disk). The package is also included if any child package claims a DLL, even if the package itself does not claim a DLL. (env: SYFT_DOTNET_DEP_PACKAGES_MUST_CLAIM_DLL)
  dep-packages-must-claim-dll: true

  # treat DLL claims or on-disk evidence for child packages as DLL claims or on-disk evidence for any parent package (env: SYFT_DOTNET_PROPAGATE_DLL_CLAIMS_TO_PARENTS)
  propagate-dll-claims-to-parents: true

  # show all packages from the deps.json if bundling tooling is present as a dependency (e.g. ILRepack) (env: SYFT_DOTNET_RELAX_DLL_CLAIMS_WHEN_BUNDLING_DETECTED)
  relax-dll-claims-when-bundling-detected: true

golang:
  # search for go package licences in the GOPATH of the system running Syft, note that this is outside the
  # container filesystem and potentially outside the root of a local directory scan (env: SYFT_GOLANG_SEARCH_LOCAL_MOD_CACHE_LICENSES)
  search-local-mod-cache-licenses:

  # specify an explicit go mod cache directory, if unset this defaults to $GOPATH/pkg/mod or $HOME/go/pkg/mod (env: SYFT_GOLANG_LOCAL_MOD_CACHE_DIR)
  local-mod-cache-dir: "~/go/pkg/mod"

  # search for go package licences in the vendor folder on the system running Syft, note that this is outside the
  # container filesystem and potentially outside the root of a local directory scan (env: SYFT_GOLANG_SEARCH_LOCAL_VENDOR_LICENSES)
  search-local-vendor-licenses:

  # specify an explicit go vendor directory, if unset this defaults to ./vendor (env: SYFT_GOLANG_LOCAL_VENDOR_DIR)
  local-vendor-dir: ""

  # search for go package licences by retrieving the package from a network proxy (env: SYFT_GOLANG_SEARCH_REMOTE_LICENSES)
  search-remote-licenses:

  # remote proxy to use when retrieving go packages from the network,
  # if unset this defaults to $GOPROXY followed by https://proxy.golang.org (env: SYFT_GOLANG_PROXY)
  proxy: "https://proxy.golang.org,direct"

  # specifies packages which should not be fetched by proxy
  # if unset this defaults to $GONOPROXY (env: SYFT_GOLANG_NO_PROXY)
  no-proxy: ""

  main-module-version:
    # look for LD flags that appear to be setting a version (e.g. -X main.version=1.0.0) (env: SYFT_GOLANG_MAIN_MODULE_VERSION_FROM_LD_FLAGS)
    from-ld-flags: true

    # search for semver-like strings in the binary contents (env: SYFT_GOLANG_MAIN_MODULE_VERSION_FROM_CONTENTS)
    from-contents: false

    # use the build settings (e.g. vcs.version & vcs.time) to craft a v0 pseudo version
    # (e.g. v0.0.0-20220308212642-53e6d0aaf6fb) when a more accurate version cannot be found otherwise (env: SYFT_GOLANG_MAIN_MODULE_VERSION_FROM_BUILD_SETTINGS)
    from-build-settings: true

java:
  # enables Syft to use the network to fetch version and license information for packages when
  # a parent or imported pom file is not found in the local maven repository.
  # the pom files are downloaded from the remote Maven repository at 'maven-url' (env: SYFT_JAVA_USE_NETWORK)
  use-network:

  # use the local Maven repository to retrieve pom files. When Maven is installed and was previously used
  # for building the software that is being scanned, then most pom files will be available in this
  # repository on the local file system. this greatly speeds up scans. when all pom files are available
  # in the local repository, then 'use-network' is not needed.
  # TIP: If you want to download all required pom files to the local repository without running a full
  # build, run 'mvn help:effective-pom' before performing the scan with syft. (env: SYFT_JAVA_USE_MAVEN_LOCAL_REPOSITORY)
  use-maven-local-repository:

  # override the default location of the local Maven repository.
  # the default is the subdirectory '.m2/repository' in your home directory (env: SYFT_JAVA_MAVEN_LOCAL_REPOSITORY_DIR)
  maven-local-repository-dir: "~/.m2/repository"

  # maven repository to use, defaults to Maven central (env: SYFT_JAVA_MAVEN_URL)
  maven-url: "https://repo1.maven.org/maven2"

  # depth to recursively resolve parent POMs, no limit if <= 0 (env: SYFT_JAVA_MAX_PARENT_RECURSIVE_DEPTH)
  max-parent-recursive-depth: 0

  # resolve transient dependencies such as those defined in a dependency's POM on Maven central (env: SYFT_JAVA_RESOLVE_TRANSITIVE_DEPENDENCIES)
  resolve-transitive-dependencies: false

javascript:
  # enables Syft to use the network to fill in more detailed license information (env: SYFT_JAVASCRIPT_SEARCH_REMOTE_LICENSES)
  search-remote-licenses:

  # base NPM url to use (env: SYFT_JAVASCRIPT_NPM_BASE_URL)
  npm-base-url: ""

  # include development-scoped dependencies (env: SYFT_JAVASCRIPT_INCLUDE_DEV_DEPENDENCIES)
  include-dev-dependencies:

linux-kernel:
  # whether to catalog linux kernel modules found within lib/modules/** directories (env: SYFT_LINUX_KERNEL_CATALOG_MODULES)
  catalog-modules: true

nix:
  # enumerate all files owned by packages found within Nix store paths (env: SYFT_NIX_CAPTURE_OWNED_FILES)
  capture-owned-files: false

python:
  # when running across entries in requirements.txt that do not specify a specific version
  # (e.g. "sqlalchemy >= 1.0.0, <= 2.0.0, != 3.0.0, <= 3.0.0"), attempt to guess what the version could
  # be based on the version requirements specified (e.g. "1.0.0"). When enabled the lowest expressible version
  # when given an arbitrary constraint will be used (even if that version may not be available/published). (env: SYFT_PYTHON_GUESS_UNPINNED_REQUIREMENTS)
  guess-unpinned-requirements: false

registry:
  # skip TLS verification when communicating with the registry (env: SYFT_REGISTRY_INSECURE_SKIP_TLS_VERIFY)
  insecure-skip-tls-verify: false

  # use http instead of https when connecting to the registry (env: SYFT_REGISTRY_INSECURE_USE_HTTP)
  insecure-use-http: false

  # Authentication credentials for specific registries. Each entry describes authentication for a specific authority:
  # - authority: the registry authority URL the URL to the registry (e.g. "docker.io", "localhost:5000", etc.) (env: SYFT_REGISTRY_AUTH_AUTHORITY)
  #   username: a username if using basic credentials (env: SYFT_REGISTRY_AUTH_USERNAME)
  #   password: a corresponding password (env: SYFT_REGISTRY_AUTH_PASSWORD)
  #   token: a token if using token-based authentication, mutually exclusive with username/password (env: SYFT_REGISTRY_AUTH_TOKEN)
  #   tls-cert: filepath to the client certificate used for TLS authentication to the registry (env: SYFT_REGISTRY_AUTH_TLS_CERT)
  #   tls-key: filepath to the client key used for TLS authentication to the registry (env: SYFT_REGISTRY_AUTH_TLS_KEY)
  auth: []

  # filepath to a CA certificate (or directory containing *.crt, *.cert, *.pem) used to generate the client certificate (env: SYFT_REGISTRY_CA_CERT)
  ca-cert: ""

# specify the source behavior to use (e.g. docker, registry, oci-dir, ...) (env: SYFT_FROM)
from: []

# an optional platform specifier for container image sources (e.g. 'linux/arm64', 'linux/arm64/v8', 'arm64', 'linux') (env: SYFT_PLATFORM)
platform: ""

source:
  # set the name of the target being analyzed (env: SYFT_SOURCE_NAME)
  name: ""

  # set the version of the target being analyzed (env: SYFT_SOURCE_VERSION)
  version: ""

  # the organization that supplied the component, which often may be the manufacturer, distributor, or repackager (env: SYFT_SOURCE_SUPPLIER)
  supplier: ""

  # (env: SYFT_SOURCE_SOURCE)
  source: ""

  # base directory for scanning, no links will be followed above this directory, and all paths will be reported relative to this directory (env: SYFT_SOURCE_BASE_PATH)
  base-path: ""

  file:
    # the file digest algorithms to use on the scanned file (options: "md5", "sha1", "sha224", "sha256", "sha384", "sha512") (env: SYFT_SOURCE_FILE_DIGESTS)
    digests:
      - "SHA-256"

  image:
    # allows users to specify which image source should be used to generate the sbom
    # valid values are: registry, docker, podman (env: SYFT_SOURCE_IMAGE_DEFAULT_PULL_SOURCE)
    default-pull-source: ""

    # (env: SYFT_SOURCE_IMAGE_MAX_LAYER_SIZE)
    max-layer-size: ""

# exclude paths from being scanned using a glob expression (env: SYFT_EXCLUDE)
exclude: []

unknowns:
  # remove unknown errors on files with discovered packages (env: SYFT_UNKNOWNS_REMOVE_WHEN_PACKAGES_DEFINED)
  remove-when-packages-defined: true

  # include executables without any identified packages (env: SYFT_UNKNOWNS_EXECUTABLES_WITHOUT_PACKAGES)
  executables-without-packages: true

  # include archives which were not expanded and searched (env: SYFT_UNKNOWNS_UNEXPANDED_ARCHIVES)
  unexpanded-archives: true

cache:
  # root directory to cache any downloaded content; empty string will use an in-memory cache (env: SYFT_CACHE_DIR)
  dir: "~/.cache/syft"

  # time to live for cached data; setting this to 0 will disable caching entirely (env: SYFT_CACHE_TTL)
  ttl: "7d"

# show catalogers that have been de-selected (env: SYFT_SHOW_HIDDEN)
show-hidden: false

attest:
  # the key to use for the attestation (env: SYFT_ATTEST_KEY)
  key: ""

  # password to decrypt to given private key
  # additionally responds to COSIGN_PASSWORD env var (env: SYFT_ATTEST_PASSWORD)
  password: ""
6.5 - Configuration Reference
Configuration patterns and options used across all Anchore OSS tools
All Anchore open source tools (Syft, Grype, Grant) share the same configuration system. This guide explains how to configure these tools using command-line flags, environment variables, and configuration files.
Configuration precedence
When you configure a tool, settings are applied in a specific order. If the same setting is specified in multiple places, the tool uses the value from the highest-priority source:
Command-line arguments (highest priority)
Environment variables
Configuration file
Default values (lowest priority)
For example, if you set the log level using all three methods, the command-line flag overrides the environment variable, which overrides the config file value.
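For example, suppose your .syft.yaml sets log.level to warn. The following hypothetical sequence shows how the higher-priority sources win (the image name is a placeholder):

# .syft.yaml sets:
#   log:
#     level: "warn"
export SYFT_LOG_LEVEL=info   # environment variable overrides the config file
syft <image> -vv             # command-line flag overrides both, so this run logs at debug level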
Tip
Running a tool at a verbose log level (for example -vv) prints the entire active configuration at startup, showing you exactly which values are being used.
Viewing your configuration
To see available configuration options and current settings:
syft --help — shows all command-line flags
syft config — prints a complete sample configuration file
syft config --load — displays your current active configuration
Replace syft with the tool you’re using (grype, grant, etc.).
Using environment variables
Every configuration option can be set via environment variable. The variable name follows the path to the setting in the configuration file.
Example: To enable pretty-printed JSON output, the config file setting is:
format:
json:
pretty: true
The path from root to this value is format → json → pretty, so the environment variable is:
export SYFT_FORMAT_JSON_PRETTY=true
The pattern is: <TOOL>_<PATH>_<TO>_<SETTING> where:
<TOOL> is the uppercase tool name (SYFT, GRYPE, GRANT)
Path segments are joined with underscores
All letters are uppercase
More examples:
# Set log level to debug
export SYFT_LOG_LEVEL=debug

# Configure output format
export GRYPE_OUTPUT=json

# Set registry credentials
export SYFT_REGISTRY_AUTH_USERNAME=myuser
Using a configuration file
Configuration files use YAML format. The tool searches these locations in order and uses the first file it finds:
Replace syft with your tool name (grype, grant, etc.).
Note
Only the first config file found is used — configuration files are not merged together.
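As an illustration, a minimal .syft.yaml that sets a few common options (using keys from the default configuration shown earlier) might look like this:

# .syft.yaml
log:
  level: "info"
format:
  json:
    pretty: true
output:
  - "syft-json"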
7 - About
About Anchore OSS and its community
7.1 - OSS Team
Meet the team behind the tools
Anchore Open Source Team
Faces!
7.2 - Events
Anchore OSS Community Events and Meetings
Open Source Live Streams
Almost every Thursday the OSS team holds a “Gardening” live stream on the Anchore YouTube channel. Each week, we announce what time the live stream is happening in the Announcements on Discourse.
We hold open meetings with the community, on alternate Thursdays. These are on Zoom, and are not recorded or streamed. There is an optional agenda which can be filled in. Everyone is welcome. A webcam is not required.
Anchore Events
Anchore has a separate Events page, for announcing industry & corporate events, and webinars.
7.3 - OSS Adopters
Adopters of Anchore Open Source Tools
Our tools are used by organisations and developer teams of all sizes. Below is a small sample of users of our tools, in public GitHub repositories.
More organisations that have adopted our tools can be found in public GitHub repositories.
This style guide is for the Anchore OSS documentation.
The style guide helps contributors to write documentation that readers can understand quickly and correctly.
The Anchore OSS docs aim for:
Consistency in style and terminology, so that readers can expect certain structures and conventions. Readers don’t have to keep re-learning how to use the documentation or questioning whether they’ve understood something correctly.
Clear, concise writing so that readers can quickly find and understand the information they need.
Capitalize only the first letter of each heading within the page. (That is, use sentence case.)
Capitalize (almost) every word in page titles. (That is, use title case.)
The little words like “and”, “in”, etc., don’t get a capital letter.
In page content, use capitals only for brand names, like Syft, Anchore, and so on.
See more about brand names below.
Don’t use capital letters to emphasize words.
Spell out abbreviations and acronyms on first use
Always spell out the full term for every abbreviation or acronym the first time you use it on the page.
Don’t assume people know what an abbreviation or acronym means, even if it seems like common knowledge.
Example: “To run Grype locally in a virtual machine (VM)”
Use contractions if you want to
For example, it’s fine to write “it’s” instead of “it is”.
Use full, correct brand names
When referring to a product or brand, use the full name.
Capitalize the name as the product owners do in the product documentation.
Do not use abbreviations even if they’re in common use, unless the product owner has sanctioned the abbreviation.
Use this: Anchore
Instead of this: anchore

Use this: Kubernetes
Instead of this: k8s

Use this: GitHub
Instead of this: github
Be consistent with punctuation
Use punctuation consistently within a page.
For example, if you use a period (full stop) after every item in a list, then use a period on all other lists on the page.
Check the other pages if you’re unsure about a particular convention.
Examples:
Most pages in the Anchore OSS docs use a period at the end of every list item.
There is no period at the end of the page subtitle and the subtitle need not be a full sentence.
(The subtitle comes from the description in the front matter of each page.)
Use active voice rather than passive voice
Passive voice is often confusing, as it’s not clear who should perform the action.
Use active voice: “You can configure Grype to”
Instead of passive voice: “Grype can be configured to”

Use active voice: “Add the directory to your path”
Instead of passive voice: “The directory should be added to your path”
Use simple present tense
Avoid future tense (“will”) and complex syntax such as conjunctive mood (“would”, “should”).
Use simple present tense: “The following command provisions a virtual machine”
Instead of future tense or complex syntax: “The following command will provision a virtual machine”

Use simple present tense: “If you add this configuration element, the system is open to the Internet”
Instead of future tense or complex syntax: “If you added this configuration element, the system would be open to the Internet”
Exception: Use future tense if it’s necessary to convey the correct meaning. This requirement is rare.
Address the audience directly
Using “we” in a sentence can be confusing, because the reader may not know whether they’re part of the “we” you’re describing.
For example, compare the following two statements:
“In this release we’ve added many new features.”
“In this tutorial we build a flying saucer.”
The words “the developer” or “the user” can be ambiguous.
For example, if the reader is building a product that also has users,
then the reader does not know whether you’re referring to the reader or the users of their product.
Address the reader directly: “Include the directory in your path”
Instead of “we”, “the user”, or “the developer”: “The user must make sure that the directory is included in their path”

Address the reader directly: “In this tutorial you build a flying saucer”
Instead of “we”, “the user”, or “the developer”: “In this tutorial we build a flying saucer”
Use short, simple sentences
Keep sentences short. Short sentences are easier to read than long ones.
Below are some tips for writing short sentences.
Use fewer words instead of many words that convey the same meaning
Use this: “You can use”
Instead of this: “It is also possible to use”

Use this: “You can”
Instead of this: “You are able to”
Split a single long sentence into two or more shorter ones
Use this: “You do not need a running GKE cluster. The deployment process creates a cluster for you”
Instead of this: “You do not need a running GKE cluster, because the deployment process creates a cluster for you”
Use a list instead of a long sentence showing various options
Use this:
To scan a container for vulnerabilities:
1. Package the software in an OCI container.
2. Upload the container to an online registry.
3. Run Grype with the container name as a parameter.

Instead of this:
To scan a container, you must package the software in an OCI container, upload the container to an online registry, and run Grype with the container name as a parameter.
Avoid too much text styling
Use bold text when referring to UI controls or other UI elements.
Use code style for:
filenames, directories, and paths
inline code and commands
object field names
Avoid using bold text or capital letters for emphasis.
If a page has too much textual highlighting it becomes confusing and even annoying.
Use angle brackets for placeholders
For example:
export SYFT_PARALLELISM=<number>
--email <your email address>
Style your images
The Anchore OSS docs recognize Bootstrap classes to style images and other content.
Typical styling adds Bootstrap spacing and border utility classes to the image element so that it shows up nicely on the page; check existing pages for the exact classes in use.
The Google Developer Documentation Style Guide contains detailed information about specific aspects of writing clear, readable, succinct documentation for a developer audience.
Next steps
Take a look at the documentation README for guidance on contributing to the Anchore OSS docs.
8 - Release Notes
Information about recent Anchore OSS releases
The following pages show the release notes for each of our open source SBOM and vulnerability scanning tools:
A HUGE thank you to @rezmoss for his help identifying and solving an issue causing excessive time and memory consumption with large numbers of symlinks! ❤️
Add new ‘--source-version’ and ‘--source-name’ options to set the name and version of the target being analyzed for reference in resulting syft-json format SBOMs (more formats will support these flags soon). [Issue #1399] [PR #1859] [kzantow]
unintended artifactRelationship records of type ownership-by-file-overlap are being reported in SBOMs generated against current fedora container images [Issue #1077]
2d452bf Add inline-comparison as acceptance test (#130)
4c7784d Add shell completion script (#131)
86d3336 Add macos quarantine to readme (#129)
a3a3e38 replace master with main (#128)
fa5d2b5 fix readme installation notice
817ce61 Add detailed location info to json artifact (#127)
dc8dfc8 fix panic on top-level log (#125)
f855a38 pull all commits on checkout for release to build changelog (#126)
bfc5dd8 replace fetching->loading and reading->parsing in UI (#124)
70e6732 Add poetry cataloger (#121)
e2a874a finalize json output & add schema (#118)
2560266 Initial README (#120)
8fe59c6 bump stereoscope for docker pull + add UI elements for pull status (#117)
78515da replace zap logger with logrus (#116)
076d5c2 fix ui handlers to write before first event
5320280 show message when no packages are discovered (#115)
c67e17a Merge pull request #114 from anchore/issue-111
04a1c91 java: fallback to manifest.ImplTitle when there is no name
bb81c0b tests: java cataloger tests for selecting name
e397659 pull in fix for bounds check progress formatting values in etui
271ba35 Export UI handlers for reuse in other tools (#113)
857f41b Merge pull request #112 from anchore/ignore-prerelease-versions
ad1a72c ignore prerelease verions when uploading version file on release
bc69382 Merge pull request #110 from anchore/issue-8
caecce9 tests: update integration tests to include yarn packages
713f660 cataloger: update controller to use javascript (vs. npm)
d79cece tests: verify new yarn.lock parser
5790474 pkg: define the Yarn package type
67fb132 cataloger: implement the yarn.lock parser
146b4bd cataloger: rename npm to javascript to accommodate yarn parser
msrc matcher should search by package ecosystem, not by distro [#2748 @westonsteimel]
Grype does not report any vulnerabilities for CPEs with target_sw field set to value that does not correspond to known package type [#2768 #2772 @willmurphyscode]
[!IMPORTANT]
As of Grype v0.88.0, the listing file which hosts the URLs of databases to download has migrated from https://toolbox-data.anchore.io/grype/databases/listing.json to https://grype.anchore.io/databases/v6/latest.json.
Added Features
Show suggested fixed version when there are multiple listed [#2264 #2271 @tomersein]
Bug Fixes
Check for vulnerability database update failed with unsupported protocol scheme when referencing local file [#2507 #2508 @wagoodman]
[!IMPORTANT]
With #2126 the listing file which hosts the URLs of databases to download has migrated from https://toolbox-data.anchore.io/grype/databases/listing.json to https://grype.anchore.io/databases/v6/latest.json.
update CI to install golang at the latest version [#1949 @spiffcs]
Grype is now built with the latest version of Golang at v1.22.x. This resolves a few security findings that would have been flagged against the v0.79.0 binary for using an older version of the Golang standard library.
You can now list multiple output formats and files to write to disk with one command, like Syft: “-o format1=file1 -o format1=file2” [Issue #648] [PR #1346] [olivierboudet]
Bug Fixes
Correctly detect format of CycloneDX XML SBOM with no components [Issue #1005]
Fix vulnerability summary counts to be less confusing. [Issue #1360]
Always include the specific package name and version used in the vulnerability search in the matchDetails section of the output [PR #1339] [westonsteimel]
f13b9a7 Use latest versions of anchore repos (#164)
326afa3 Add OCI support + use URI schemes (#160)
9f6301b Change root of JSON presenter to a mapping (instead of a sequence) (#163)
b2715ff Update high level docs (#162)
ed9f9bc remove duplicate rows from the summary table (#161)
ec493d5 Merge pull request #159 from anchore/update-testutils
578afab update go.mod and go.sum
c73a337 fix replacement of results with matches (#158)
f0f8f4b add --fail-on threshold support (#156)
0397206 Merge pull request #154 from anchore/issue-148
ca19b08 presenter: cyclonedx shouldn’t eat up errors
7b71401 cyclonedx tests: update BD name to use grype instead of syft
2d44839 presenter: cyclonedx document updates to pass schema validation
4f78b57 presenter: cyclonedx vulnerability schema fixes
2b8dfc2 temporary bump of go deps for testing
0fb5080 presenter: add new golden files for cyclonedx tests
46f3948 presenter: remove unneeded golden files
3de06ce presenter: join dir+img presesnter tests for cyclonedx
298a801 tests: update CycloneDX presenters with new namespaces
80d494b presenter: add xmlns for bd and v namespaces in cyclonedx output
3a57218 ci: hook the cyclonedx validation into CircleCI
57d777c tests: add cyclonedx schema check
2c1ddbe Merge pull request #152 from anchore/fix-json-keys
cb437b6 Change kebab case to camelCase, use updated syft version
ca8ac61 Rename Result object to Matches (#153)
ad7d9d5 Merge pull request #151 from anchore/fix-version-json-output-casing
9fa5064 Fix json keys to be camel case instead of kebab
293368e Shell completion via Cobra utility (#149)
0f97081 add positional argument validation (#150)
1338850 Add fixed-in-version to the presenters (#147)
bd50ffc Change search key json output to a map (#146)
c0efed5 Merge pull request #143 from anchore/issue-39
c768955 presenter: cyclonedx tests
8fc7efd result: add a helper to get packages by ID
444b191 presenter: set the options to hook CycloneDX output
48c3c2a presenter: add a cyclonedx presenter
8e8ad48 dependencies: update to latest syft and include uuid
b77e023 Merge pull request #137 from anchore/issue-94
d2949a2 matcher: add duplicate to demonstrate they don’t show up
89f8ac4 test: update integration to match new SearchMatches
46f614d tests: json presenter output updated
5428cc2 presenter: json to use a string for the search key, not a map
2d7af0b matchers: use strings for SearchKeys
87c267f matchers: cpe should prevent duplicates by not adding already present CPEs
b8a4183 vuln matches should include search matches
651751f simplify version cmd + add json option (#139)
be6a7ea Update README.md to highlight supported distros and languages (#135)
8757b47 Merge pull request #136 from anchore/issue-py-setup
b0c6dc2 test: update scope.FilesByGlob, it is now part of Resolver
b8e9431 dependencies: bump to latest syft that includes setup.py support
618672a matcher: use pkg.PythonSetupPkg as well
3836626 add demo gif (#134)
d3987d7 Update modules (#127)
66b2512 Merge pull request #124 from anchore/issue-91
b237bf9 test: fuzzyConstraint needs a hint now, update tests
75b3537 version: use hint if provided
84684f2 test: add examples of crazy PEP440 rules
0399e08 version: use the new PythonFormat
41147df test: update integration validation for python packages with Python format
0618d1d github is picky about the issue template file extension
d0b03fa add slack links to issue selection (#123)
a34bf6e Merge pull request #122 from nwl/readme-fixes
f2ce94b Replaced stray syft entries with grype
93e39a7 Merge pull request #120 from anchore/readme-install-fix
2caa0d2 docs: emphasize installation methods before features and getting started
89a6201 Disable prerelease version update check (#118)
12b2296 Add future ideas + beta warning to README (#114)
8052fa6 Update installation method (#117)
56b9576 Add inline-comparison as acceptance test (#106)
f98e3cd replace search key from table with severity (#107)
37ceb17 Add shell completion script (#109)
2ccdefd Add poetry to package types (#108)
30d72dd fix spaces alignment on etui
c1fdaba Adding additional detail to README (#103)
f1ad989 replace master with main (#104)
6de7e40 finalize the json output (no schema yet) (#102)
76ff973 Merge pull request #99 from anchore/issue-18
5d057db cpe: update tests to match new ANY in product name
d8da43b test: update integration tests for alpine
e4689c6 matcher: add apk matcher unit tests
44767fc result: add a Count() helper method
4476fc9 broaden cpe matcher + modify alpine matcher
a9bf268 integration tests for corner case
cff46b8 add apk to controller
e0db0c1 test: add integration corner cases for Alpine
905cae5 matcher: add APK support
317b383 match: add APK matcher type
5147985 add description and cvss metadata to v1 schema (#100)
4e6eb13 fix panic on top-level log (#97)
81eab4e pull all commits on checkout for release to build changelog (#98)
f3756d0 change default scope to squashed (from all-layers) (#95)
0cfca60 Merge pull request #83 from anchore/initial-docs
57d73a5 docs: update README with sections and DB information
2cd127b Update pkg type (#87)
e1f4c54 bump syft for docker pull + UI elements for pull status (#81)
5261e4a Merge pull request #84 from anchore/help-error
c581a45 cmd: display help menu when no args are passed in - skip the error
87e6dc0 Merge pull request #82 from anchore/log-fix
b214c29 cmd: fix log identifier for stereoscope
fb8f3d8 restore log source after etui exit
11731fa replace zap logger with logrus (#80)
861883c pull in fix for bounds check progress formatting values in etui
Please file an issue or reach out on the issue board tagging @spiffcs if you need support, have feature requests or bug fixes, or have ideas for future features and PRs.
NOTE: if you are using this action within a matrix build and see failures attempting to upload artifacts with duplicate names, you will need to set the artifact-name to be unique based on the matrix properties (an example here). This is due to a change to use a newer GitHub API which no longer allows artifacts with duplicate names.
feat: add output-file option, default to random directory output in temp (#346) [kzantow]
The action no longer generates files in your working directory by default; instead, use the action outputs: ${{ steps.<id>.outputs.sarif }}, where <id> needs to match the id you configured to reference the scan-action, e.g.:
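For illustration only (input names, output names, and action versions shown here are assumptions and may differ for your setup), a workflow step consuming the SARIF output might look like:

- uses: anchore/scan-action@v3
  id: scan
  with:
    image: "localbuild/testimage:latest"
    fail-build: false
- uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: ${{ steps.scan.outputs.sarif }}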
feat: short-lived grype-db cache (#348) [kzantow]
Note: with this release grype is no longer installed on $PATH. We suspect the changes here could break a number of users of the action who have learned to expect Grype to be installed on $PATH.
New major version of scan action based on new Grype tool from Anchore that is much faster for scanning compared to v1.x and adds some new capabilities and more metadata about the matches.
Significantly faster performance for scans
New vulnerabilities output format is the JSON output from Grype directly
Adds support for scanning directories as well as Docker containers, so you can do the same checks pre- and post-build of the container.
Supports Automatic Code Scanning/SARIF for exposing results via your repository’s Security tab.
This is a breaking change from v1.x, as indicated by the major version revision:
Use the image input parameter instead of image-reference
dockerfile-path is no longer supported and not necessary for the vulnerability scans
custom-policy-path is no longer supported
include-app-packages is no longer necessary or supported. Application packages are on by default and will receive vulnerability matches.
Outputs:
billofmaterials is no longer output. V2 is focused on vulnerability scanning and another action may be introduced for BoM support with its own options/config.
Bumps version of anchore used to v0.6.0 as well as adding an input parameter to enable overriding the Anchore inline scan version. Other updates are internal optimizations, test improvements, and code cleanup.
Definitions of terms used in software security, SBOM generation, and vulnerability scanning
A
Artifact
In Syft’s JSON output format, “artifacts” refers to the array of software packages discovered during scanning.
Each artifact represents a single package (library, application, OS package, etc.) with its metadata, version, licenses, locations, and identifiers like CPE and PURL.
This is distinct from general software artifacts like binaries or container images.
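For example, an abridged artifact entry from syft-json output looks roughly like the following (field names and values are illustrative and vary by package type and Syft version):

{
  "name": "openssl",
  "version": "3.1.4-r5",
  "type": "apk",
  "foundBy": "apk-db-cataloger",
  "purl": "pkg:apk/alpine/openssl@3.1.4-r5?arch=x86_64",
  "cpes": ["cpe:2.3:a:openssl:openssl:3.1.4-r5:*:*:*:*:*:*:*"]
}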
A cryptographically signed statement about a software artifact that provides verifiable claims about its properties, such as provenance, build process, or security scan results.
Attestations establish trust in the software supply chain by allowing you to verify that an SBOM truly represents a specific artifact or that vulnerability scan results are authentic.
Why it matters: Attestations enable you to verify the authenticity and integrity of SBOMs generated by Syft and vulnerability reports from Grype, ensuring they haven’t been tampered with.
C
Cataloger
A cataloger is a component within Syft that specializes in discovering and extracting package information from specific ecosystems or file formats.
Each cataloger knows how to find and parse packages for a particular type (e.g., apk-cataloger for Alpine packages, npm-cataloger for Node.js packages).
When Syft scans a target, it runs multiple catalogers to comprehensively discover all software components.
Why it matters: The foundBy field in Syft’s JSON output tells you which cataloger discovered each package, which can help debug why certain packages appear in your SBOM or troubleshoot scanning issues.
A lightweight, standalone, executable package that includes everything needed to run a piece of software, including the code, runtime, system tools, libraries, and settings.
Container images are built from layers and typically run using container runtimes like Docker or containerd. See also OCI.
Why it matters: Both Syft and Grype can scan container images directly without requiring them to be running. Syft generates SBOMs from container images, and Grype scans them for vulnerabilities.
Common Platform Enumeration (CPE) is a standardized method for describing and identifying software applications, operating systems, and hardware devices.
CPEs are used in vulnerability databases to match software components with known vulnerabilities.
Formats:
URI binding: cpe:/{part}:{vendor}:{product}:{version}:{update}:{edition}:{language}
Why it matters: Syft generates CPEs for discovered packages (from the NVD dictionary or synthetic generation), which Grype then uses to match packages against vulnerability data.
Understanding CPEs helps you troubleshoot why certain vulnerabilities are or aren’t being detected relative to vulnerabilities from NVD.
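For example, a CPE for OpenSSL 3.0.7 can be written in either binding (values shown are illustrative):

cpe:/a:openssl:openssl:3.0.7
cpe:2.3:a:openssl:openssl:3.0.7:*:*:*:*:*:*:*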
Common Vulnerabilities and Exposures (CVE) is a standardized identifier for publicly known security vulnerabilities.
Each CVE ID uniquely identifies a specific vulnerability and provides a common reference point for discussing and tracking security issues.
Format example: CVE-2024-12345
Why it matters: Grype reports vulnerabilities by their CVE IDs, making it easy to research specific issues, understand their impact, and find remediation guidance.
Each match in a Grype scan references one or more CVE IDs.
Common Vulnerability Scoring System (CVSS) is an open framework for communicating the characteristics and severity of software vulnerabilities.
CVSS (base) scores range from 0.0 to 10.0, with higher scores indicating more severe vulnerabilities.
Severity ranges:
None: 0.0
Low: 0.1-3.9
Medium: 4.0-6.9
High: 7.0-8.9
Critical: 9.0-10.0
There are more dimensions to CVSS, including Temporal and Environmental scores, but the Base score is the most commonly used as a way to quickly assess severity.
Why it matters: Grype uses CVSS scores to categorize vulnerability severity, helping you prioritize which issues to fix first.
You can filter Grype results by severity level to focus on the most critical vulnerabilities.
CycloneDX is an open-source standard for creating Software Bill of Materials (SBOMs), supporting JSON and XML representations.
Why it matters: Syft can generate SBOMs in CycloneDX format (-o cyclonedx-json or -o cyclonedx-xml), which is widely supported by security tools and compliance platforms.
Grype can also scan CycloneDX SBOMs for vulnerabilities.
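For example, you can generate a CycloneDX SBOM with Syft and scan the resulting file with Grype (the image and file names are only examples):

syft alpine:3.19 -o cyclonedx-json > sbom.cdx.json
grype sbom:./sbom.cdx.json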
A software component that another piece of software relies on to function. Dependencies can be direct (explicitly required by your code) or transitive (required by your dependencies).
Understanding and tracking dependencies is crucial for security and license compliance.
Why it matters: Syft catalogs both direct and transitive dependencies in your software, creating a complete inventory.
Grype then scans all dependencies for vulnerabilities, not just your direct dependencies—important because transitive dependencies often contain hidden security risks.
Distro
Short for “distribution”, referring to a specific Linux distribution like Alpine, Ubuntu, Debian, or Red Hat. The distro information includes the distribution name and version (e.g., “alpine 3.18”).
Why it matters: Grype uses distro information to match OS packages against the correct vulnerability database.
Syft automatically detects the distro from files like /etc/os-release and includes it in the SBOM, ensuring accurate vulnerability matching.
Docker is a platform for developing, shipping, and running applications in containers. While Docker is a specific implementation, the term is often used colloquially to refer to container technology in general.
See Container image and OCI.
Why it matters: Syft and Grype can pull and scan images directly from Docker registries or analyze images in your local Docker daemon without needing Docker to be installed.
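For example, both tools accept an explicit source scheme when you want to control where the image comes from (the image name is only an example):

syft docker:nginx:latest       # read the image from the local Docker daemon
grype registry:nginx:latest    # pull directly from the registry, no daemon required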
In software, an ecosystem refers to a package management system and its associated community, tools, and conventions.
Examples include npm (JavaScript), PyPI (Python), Maven Central (Java), and RubyGems (Ruby).
Different ecosystems have different package formats, naming conventions, and vulnerability data sources.
Why it matters: Syft supports dozens of package ecosystems, and each uses a different cataloger.
The ecosystem determines how packages are identified (PURL type), which metadata is captured, and which vulnerability data sources Grype uses for matching.
Exploit Prediction Scoring System (EPSS) is a data-driven framework that estimates the probability that a software vulnerability will be exploited in the wild within the next 30 days.
EPSS scores range from 0 to 1 (or 0% to 100%), with higher scores indicating a greater likelihood of exploitation based on real-world threat intelligence.
Unlike CVSS which measures theoretical severity, EPSS predicts actual exploitation probability by analyzing factors like available exploits, social media activity, and observed attacks (among other signals).
Why it matters: EPSS helps you prioritize vulnerabilities more effectively than severity alone. A critical CVSS vulnerability with a low EPSS score might be less urgent than a medium severity issue with a high EPSS score.
Grype can display EPSS scores alongside CVSS to help you focus remediation efforts on vulnerabilities that are both severe and likely to be exploited.
In the context of scanning for vulnerabilities, a false positive is a vulnerability-package match reported by a scanner that doesn’t actually affect the software package in question.
False positives can occur due to incorrect CPE matching, version misidentification, or when a vulnerability applies to one variant of a package but not another.
Why it matters: When Grype reports a false positive, you can use VEX documents or Grype’s ignore rules to suppress it, preventing alert fatigue and focusing on real security issues.
False negative
In the context of scanning for vulnerabilities, a false negative occurs when a scanner fails to detect a vulnerability that actually affects a software package.
False negatives can happen when vulnerability data is incomplete, when a package uses non-standard naming or versioning, when CPE or PURL identifiers don’t match correctly, or when the vulnerability database hasn’t been updated yet.
Why it matters: False negatives are more dangerous than false positives because they create a false sense of security.
To minimize false negatives, keep Grype’s vulnerability database updated regularly and understand that no scanner catches 100% of vulnerabilities—defense in depth and multiple security controls are essential.
K
KEV
Known Exploited Vulnerability (KEV) is a designation for vulnerabilities that have been confirmed as actively exploited in real-world attacks.
CISA (Cybersecurity and Infrastructure Security Agency) maintains the authoritative KEV catalog, which lists CVEs with evidence of active exploitation and
provides binding operational directives for federal agencies.
The CISA KEV catalog includes:
CVE identifiers for exploited vulnerabilities
The product and vendor affected
A brief description of the vulnerability
Required remediation actions
Due dates for federal agencies to patch
Vulnerabilities are added to the KEV catalog based on reliable evidence of active exploitation, such as public reporting, threat intelligence, or incident response data.
Why it matters: KEV status is a strong signal for prioritization—these vulnerabilities are being actively exploited right now.
When Grype identifies a vulnerability that’s on the CISA KEV list, you should treat it as urgent regardless of CVSS score.
A medium-severity KEV vulnerability poses more immediate risk than a critical-severity vulnerability that’s never been exploited.
Some organizations make KEV remediation mandatory within tight timeframes (e.g., 15 days for critical KEVs).
Container images are built as a series of filesystem layers, where each layer represents changes from a Dockerfile instruction. Layers are stacked together to create the final filesystem.
Why it matters: By default, Syft scans only the “squashed” view of an image (what you’d see if the container were running).
Use --scope all-layers to scan all layers, which can reveal packages that were installed then deleted, potentially exposing vulnerabilities in build-time dependencies.
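For example (using the placeholder convention from this documentation):

syft <image>                      # default: catalog the squashed filesystem only
syft <image> --scope all-layers   # also catalog packages present only in intermediate layers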
A legal instrument governing the use and distribution of software. Software licenses range from permissive (MIT, Apache) to copyleft (GPL) to proprietary.
Why it matters: Syft extracts license information from packages and includes it in SBOMs, helping you ensure compliance with open source licenses and identify packages with incompatible or restricted licenses.
M
Match
A match is a vulnerability finding in Grype’s output, representing a single package-vulnerability pair. Each match indicates that a specific package version is affected by a particular CVE.
A matcher is a component within Grype that compares package information against vulnerability data using specific matching strategies.
Different matchers handle different package types or ecosystems (e.g., distro matcher for OS packages, language matcher for application dependencies).
Why it matters: Grype uses multiple matchers to ensure comprehensive vulnerability coverage. The matcher used for each finding is included in detailed output, helping you understand how the match was made.
N
NVD
National Vulnerability Database (NVD) is the U.S. government repository of known software vulnerabilities.
It provides comprehensive vulnerability information including CVE IDs, CVSS scores, and affected software configurations. The NVD is maintained by NIST.
Why it matters: The NVD is one of the primary vulnerability data sources used by Grype. Syft also uses the NVD’s CPE dictionary to generate CPEs for packages, enabling accurate vulnerability matching.
Open Container Initiative (OCI) is an open governance structure for creating industry standards around container formats and runtimes.
The OCI Image Specification defines the standard format for container images, ensuring interoperability across different container tools and platforms.
Why it matters: Syft and Grype work with OCI-compliant images from any registry (Docker Hub, GitHub Container Registry, Amazon ECR, etc.), not just Docker images. They can read images in OCI layout format directly from disk.
A bundle of software that can be installed and managed by a package manager. Packages typically include the software itself, metadata (like version and dependencies), and installation instructions.
Packages are the fundamental units tracked in an SBOM.
Why it matters: Every entry in a Syft-generated SBOM represents a package. Grype matches packages against vulnerability data to find security issues.
Understanding what constitutes a “package” in different ecosystems helps you interpret SBOM contents.
Package manager
A tool that automates the process of installing, upgrading, configuring, and removing software packages.
Examples include npm, pip, apt, yum, and Maven. Package managers maintain repositories of available packages and handle dependency resolution.
Why it matters: Syft discovers packages by reading package manager metadata files (like package.json, requirements.txt, or /var/lib/dpkg/status).
Each package manager stores information differently, which is why Syft needs ecosystem-specific catalogers.
Provenance
Information about the origin and build process of a software artifact, including who built it, when, from what source code, and using what tools.
Build provenance helps verify that software was built as expected and hasn’t been tampered with.
Why it matters: SBOMs generated by Syft during builds can be combined with provenance information to create comprehensive supply chain attestations, enabling you to verify both what’s in your software and how it was built.
Package URL (PURL) is a standardized way to identify and locate software packages across different package managers and ecosystems.
PURLs provide a uniform identifier that works across different systems.
Why it matters: Syft generates PURLs for all discovered packages, and Grype uses PURLs as one of the primary identifiers for vulnerability matching.
PURLs provide a consistent way to refer to packages across different SBOM formats.
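For example, PURLs from different ecosystems follow a common scheme, pkg:type/namespace/name@version (the versions shown are illustrative):

pkg:npm/lodash@4.17.21
pkg:pypi/requests@2.31.0
pkg:deb/debian/openssl@3.0.11?arch=amd64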
In Syft’s JSON output, relationships describe connections between artifacts (packages), files, and sources (what was scanned).
For example, a relationship might indicate that a file is “contained-by” a package, or that one package “depends-on” another.
Why it matters: Relationships provide the graph structure of your software, showing not just what packages exist but how they’re connected.
This is essential for understanding dependency chains and reachability analysis.
Software Bill of Materials (SBOM) is a comprehensive inventory of all components, libraries, and modules that make up a piece of software.
Like a list of ingredients on food packaging, an SBOM provides transparency into what’s included in your software, enabling security analysis, license compliance, and supply chain risk management.
Why it matters: Syft generates SBOMs that you can use with Grype for vulnerability scanning, share with customers for transparency, or use for license compliance.
SBOMs are becoming required by regulations and standards like Executive Order 14028.
A classification of how serious a vulnerability is, typically based on CVSS scores.
Common severity levels are Critical, High, Medium, Low, and Negligible (or None).
Why it matters: Grype reports vulnerability severity to help you prioritize remediation efforts. You can filter Grype output by severity (e.g., --fail-on high to fail CI builds for high or critical vulnerabilities).
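For example, to break a CI build on serious findings (the image name is a placeholder):

grype <image> --fail-on high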
The software supply chain encompasses all the components, processes, and steps involved in creating, building, and delivering software.
This includes source code, dependencies, build tools, CI/CD pipelines, and distribution mechanisms.
Securing the software supply chain helps prevent attacks that target the development and delivery process.
Why it matters: Syft and Grype are key tools in supply chain security. Syft provides visibility into what’s in your software (SBOM), and Grype identifies known vulnerabilities, helping you secure each link in the chain.
Source
In Syft’s JSON output, the “source” object describes what was scanned—whether it was a container image, directory, file archive, or other input. It includes details like image name, digest, and tags.
Why it matters: The source information helps you correlate SBOMs with specific artifacts, especially important when tracking multiple image versions or builds.
Software Package Data Exchange (SPDX) is an open standard for communicating software bill of materials information, including components, licenses, copyrights, and security references.
SPDX is an ISO/IEC standard (ISO/IEC 5962:2021) and supports multiple formats including JSON, YAML, XML, and tag-value.
Why it matters: Syft can generate SBOMs in SPDX format (-o spdx-json or -o spdx-tag-value), which is widely supported by compliance tools and required by many organizations and regulations.
Grype can also scan SPDX SBOMs for vulnerabilities.
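For example (the image and file names are placeholders):

syft <image> -o spdx-json > sbom.spdx.json
grype sbom:./sbom.spdx.json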
The “squashed” view of a container image represents the final filesystem that would be visible if you ran the container.
It’s the result of applying all image layers in sequence, where later layers can override or delete files from earlier layers.
Why it matters: Syft scans the squashed view by default (what you actually run), but you can use --scope all-layers to also see packages that existed in intermediate layers but were deleted before the final image.
Vulnerability Exploitability eXchange (VEX) is a series of formats for communicating information about the exploitability status of vulnerabilities in software products.
VEX documents allow software vendors to provide context about whether identified vulnerabilities actually affect their product, helping users prioritize remediation efforts.
Why it matters: Grype can consume VEX documents to suppress false positives or provide additional context about vulnerabilities.
When Grype reports a vulnerability that doesn’t actually affect your application, you can create a VEX document explaining why it’s not exploitable.
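For example, assuming you have written an OpenVEX document named vex.json, you could apply it during a scan (flag behaviour may vary by Grype version):

grype <image> --vex vex.json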
A security weakness, flaw, or defect in software that can be exploited by an attacker to perform unauthorized actions, compromise systems, steal data, or cause harm.
Vulnerabilities can arise from coding errors, design flaws, misconfigurations, or outdated dependencies with known security issues.
Not all vulnerabilities affect all users of a package. Whether a vulnerability impacts you depends on:
The specific version you’re using
Which features or code paths you actually invoke
Your deployment configuration and environment
Whether compensating security controls are in place
Why it matters: Grype identifies vulnerabilities in the packages discovered by Syft, enabling you to find and fix security issues before they can be exploited.
Not all vulnerabilities are equally serious—use severity ratings (CVSS) and exploitation probability (EPSS) to prioritize fixes.
Understanding the context of a vulnerability helps you assess real risk rather than just responding to every CVE.
A repository of known security vulnerabilities, their affected software versions, severity scores, and remediation information.
Vulnerability databases aggregate data from multiple sources like NVD, security advisories, and vendor bulletins.
Why it matters: Grype downloads and maintains a local vulnerability database that’s updated daily.
The database quality directly impacts scan accuracy—Grype uses curated, high-quality data from multiple providers to minimize false positives and false negatives.
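For example, you can inspect and refresh the local database directly:

grype db status   # show the status of the local vulnerability database
grype db update   # download the latest database now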
A tool that identifies known security vulnerabilities in software by comparing components against vulnerability databases.
Vulnerability scanners like Grype analyze software artifacts (container images, filesystems, or SBOMs) and report potential security issues that should be addressed.
Why it matters: Grype is a vulnerability scanner that works seamlessly with Syft-generated SBOMs.
You can scan images directly with Grype, or generate an SBOM with Syft first and scan it separately, enabling workflows where SBOMs are generated once and scanned multiple times as new vulnerabilities are discovered.