Arvados 3.1.0 Release Notes
March 20, 2025
The Arvados team is pleased to announce Arvados 3.1.0. This release introduces support for AMD ROCm workflows, continued Workbench optimization, and richer data management in Arvados command-line tools, as well as bug fixes throughout the platform. We recommend that existing installations of 3.0.0 or earlier upgrade to 3.1.0. See Upgrading Arvados for instructions.
Arvados API
Arvados now supports containers that rely on AMD ROCm GPU support. It works much like our existing support for NVIDIA CUDA: container requests can declare a dependency on AMD ROCm, along with their hardware requirements, and Crunch will dispatch the container accordingly.
As part of this, API attributes and configuration settings that previously referred to “CUDA” now refer to “GPU.” The API server still accepts container requests that reference the old CUDA attributes and will translate as needed. Any clients that read these fields will need to be updated. Refer to the upgrade notes for details. #21926, #22563, #22568, #22612
Collection create and update methods accept a new replace_segments parameter. This lets clients more efficiently repack the blocks in a collection. Future releases will see Keep components use this to optimize collections as they’re built. Refer to the replace_segments reference for details. #22319
The Arvados controller can forward requests to a specific port on a running container. This works like the existing container logs and shell functionality. Future releases will use this functionality to let users interact with services on long-running containers. #17209
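To illustrate the CUDA-to-GPU attribute translation described above, here is a minimal Python sketch of how a client might rewrite a legacy constraint into the newer form. Every field name used here (cuda, gpu, stack, device_count, driver_version, hardware_capability, hardware_target) is an assumption made for illustration, and the function cuda_to_gpu is hypothetical. The upgrade notes and API reference are the authority on the real schema and on what the API server actually does.

```python
def cuda_to_gpu(runtime_constraints):
    """Return a copy of runtime_constraints with a legacy "cuda" entry
    rewritten as a "gpu" entry.

    All key names are assumptions for illustration, not the confirmed schema.
    """
    updated = dict(runtime_constraints)
    cuda = updated.pop("cuda", None)
    if not cuda:
        # Nothing to translate.
        return updated
    if "gpu" in updated:
        # A new-style entry is already present; it takes precedence.
        return updated
    capability = cuda.get("hardware_capability")
    updated["gpu"] = {
        "stack": "cuda",
        "device_count": cuda.get("device_count", 0),
        "driver_version": cuda.get("driver_version", ""),
        # Assumed: the new form lists acceptable hardware targets.
        "hardware_target": [capability] if capability else [],
    }
    return updated


legacy = {
    "vcpus": 2,
    "ram": 1 << 30,
    "cuda": {
        "device_count": 1,
        "driver_version": "11.0",
        "hardware_capability": "9.0",
    },
}
print(cuda_to_gpu(legacy))
```

The function returns a new dictionary and leaves its input untouched, so a client could compare the two forms while migrating.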
Workbench
Several Workbench internals have been reworked to improve responsiveness and rendering speed throughout the entire application, with special focus on the most common components like table listings and navigation trees. #22127, #22159
Status updates for running processes are more reliable. #22116
The left-hand navigation panel clips its contents instead of becoming scrollable when listing names are too long to fit in the space. #22566, #22624
Users can resize the right-hand details panel. Workbench remembers the user’s preferred size for both it and the left-hand navigation panel. #22336
Autocomplete dropdowns throughout Workbench can now be scrolled and size themselves to stay within the browser window. This makes it easier to make selections even when the listing is too long to display. #22358
Context menus and action toolbars have been made more consistent throughout Workbench. #22051, #22593
My Favorites no longer lists items in the trash. #22000
Fixed a bug where toolbar actions might work on an object other than the one being displayed. #22408
Fixed a bug that could cause the left-hand navigation to disappear when expanding My Favorites. #22473
Fixed a bug that could cause the actions toolbar to be cut off when opening the right-hand details panel or resizing the browser window. #22359
Command-line Tools
arv-copy supports a --replication option to set the desired replication level of copied collections. GitHub PR #247, #22008
The --storage-classes and --intermediate-storage-classes options of various tools use the cluster’s configured default storage classes rather than assuming a class named default. GitHub PR #249, #22009
arv-copy looks for a cluster’s credentials in settings.conf if it does not find them in the cluster-specific configuration file. #22602
arvados-cwl-runner redacts credentials from Git remote URL workflow metadata. #22660
Fixed several bugs that could cause arv-mount to serve stale file data after a collection update on the server. A new option --refresh-time lets you configure some of this behavior. #22420
Updated Ruby tools’ version dependency on the arvados gem. #22364
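As an illustration of the credential redaction that arvados-cwl-runner now applies to Git remote URLs, the following Python sketch strips the user and password portion from a URL using only the standard library. It is not the code arvados-cwl-runner uses; the function name redact_url_credentials and the example host are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url):
    """Return url with any "user:password@" part removed from the host.

    URLs without embedded credentials are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(redact_url_credentials("https://user:s3cret@git.example.com/org/repo.git"))
```

scp-style remotes such as git@host:path have no URL scheme, so this sketch returns them unchanged.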
Packaging and Deployment
The arvados-api-server package includes a systemd service definition to run the server using the bundled Passenger. This means you no longer need to install the third-party Passenger package or configure nginx to serve it. Administrators should refer to the upgrade notes for details about how to migrate their installations. #22349, #22396, #22614
The arvados-api-server package post-installation script exits with a failure status if it cannot complete configuration. This signals the problem to orchestration tools like Salt and Ansible. #22433
The compute node image builder script has been replaced. Instead of configuring the build with command line switches, you run Packer with your cluster configuration and a second YAML configuration file for Ansible. Refer to the compute image build documentation for details. #22217, #22317
The example parameters for a single-node Salt install explicitly list roles for clarity. #22298
The arvados-api-server package post-installation script no longer fails on errors from the gem install that it runs before bundle install. The former can run into gem conflicts as a server accumulates gems over time. bundle install should be able to work around these situations, and the script fails with an error if it can’t. #22647
Crunch
Crunch supports a new configuration option Containers.CloudVMs.DeployRunnerDirectory to specify where the crunch-run binary should be stored on compute nodes. This can be used to dynamically deploy crunch-run on cloud nodes where /tmp is mounted with the noexec option. #22029
crunch-run retries failed API and Keep operations for longer to try to preserve container results. #22455
Fixed a bug where arvados-cwl-runner could set an incorrect output_glob for a container request when a workflow step’s secondary files were generated from an expression. #22466
Fixed a bug where arvados-cwl-runner would crash if a container update was successfully processed but it did not receive a valid response from the API server. #22160
Fixed a bug that could cause arvados-dispatch-cloud to hang at startup while fetching spot instance prices. #22400
crunch-run has new logic to store and load cached Singularity images to prevent crashes if other processes are updating the cache collection at the same time. #20605
Reworded some misleading log messages in crunch-run.txt when converting Docker images to run with Singularity. #20605
Reworded the log message that Crunch writes when it encounters an error checking for spot instance interruptions, to clarify that the error does not directly affect the running process. #22434
Removed “tunnel connection started/finished” log messages, which were repeated frequently and were rarely helpful. #22431
crunch-dispatch-local does basic resource accounting. It’s still not suitable for production deployments, but this lays some groundwork to make it useful for single-node installs. #22314, #21926
Servers
The API.MaxIndexDatabaseRead setting is consistently applied to all API list requests, particularly for logs. #22232
All servers in a federation report the correct expires_at time for remote API client authorizations. #22228
The API server returns a 500 Internal Server Error when it encounters various database errors, including deadlocks, to let clients know they can retry the request. #21547, #22476
API error messages will no longer include development suggestions when the server is running Ruby 3.1 or later. #22407
The keepstore index API no longer respects the configured API.RequestTimeout since it’s expected to take a long time by design. #22411
The API server indexes container requests by name+owner to improve performance for this common query. #22327
The API server disables statement timeouts when running database migrations. This will prevent timeouts for any long-running migrations added in future versions. #22435
The default setting for API.MaxConcurrentRailsRequests has been increased from 8 to 16 to avoid deadlocking on some common client access patterns. #22414
Removed a confusing warning log when API requests included unsupported parameters. #21743
Removed an unused index that was accidentally added in 3.0.0. #22467
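The note above about database errors says the API server's error response lets clients know they can retry. A client-side retry loop along those lines might look like the following Python sketch. It is a generic illustration with exponential backoff, not part of any Arvados SDK: ServerError and call_with_retry are hypothetical names, and the request is any callable you supply. The Arvados SDKs have their own retry support, which is usually the better choice in practice.

```python
import random
import time


class ServerError(Exception):
    """Hypothetical exception carrying the HTTP status of a failed request."""

    def __init__(self, status):
        super().__init__(f"server returned HTTP {status}")
        self.status = status


def call_with_retry(request, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call request() and retry on 5xx errors with exponential backoff.

    Errors below 500 and the final failed attempt are re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ServerError as err:
            if err.status < 500 or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            # Add jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Passing sleep as a parameter keeps the function easy to exercise in tests without real delays.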
Security Improvements
The Arvados Rails API server uses Rails 7.1 and Passenger 6.0.26 to address CVE-2025-26803. #22363, #22608
The Arvados Rails API server uses Rack 3.1.12 to address CVE-2025-27610 and CVE-2025-27111. #22657
Arvados is built using Go 1.23 to address various security issues in older versions. #22422
Development Changes
The interactive test runner detects whether a graphical interface is available, and does not run Cypress in interactive mode if it isn’t. #22316
Fixed several inconsistencies in the list of tests presented by the test runner. #22428, #22506
The Arvados source now includes an Ansible playbook to install and configure all the software necessary to run the Arvados test suite. We are using this playbook in CI and expect it will replace arvados-server install in a future release. The Hacking prerequisites wiki describes how to use it. #22318, #22437, #22489
arvados-cwl-runner provides a plugin for cwltest to read keep: locations. This makes it possible to run arvados-cwl-runner against the latest versions of the CWL conformance suite. #22058
arvados-server install provides a -user-account option to automatically add a user to the docker group. #22316
arvados-server install installs Singularity from a source archive instead of Git to improve reliability. #22644
When our test runner claims a port for an Arvados service to use, it more strictly checks that the expected service listens on that port, and explicitly fails if that does not happen within a few minutes. #22655
Fixed a Rails API server test that could fail if you had previously built packages in your source tree. #22424
Reworked controller’s login integration tests to work on more distributions and in more test environments. #22406
Fixed a Python SDK test that could fail depending on the filesystem settings of /tmp on the test system. #20909
Fixed an arvados-server boot test so it only checks IPv6 connectivity when the test host supports it. #22567
Fixed several Workbench issues that caused React warnings. #22231
Improved the reliability of several Workbench tests. #22483, #22545
When we build Docker images for testing and deployment, we consistently use --mount instead of --volume to avoid bugs caused by creating new empty mount points. #22567
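The test runner change above, which fails explicitly when a service does not start listening on its claimed port in time, follows a common pattern that can be sketched in Python as below. This is not the test runner's code: wait_for_port is a hypothetical helper and the default timeout is an assumption. It also only confirms that something accepts connections on the port, whereas the real check verifies that the expected service is the one listening.

```python
import socket
import time


def wait_for_port(host, port, timeout=180.0, interval=0.5):
    """Block until a TCP connection to (host, port) succeeds.

    Raises TimeoutError if nothing accepts a connection before the
    deadline, so the caller fails loudly instead of hanging.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"nothing listening on {host}:{port} after {timeout}s"
                )
            time.sleep(interval)
```

Using time.monotonic() for the deadline keeps the wait unaffected by system clock adjustments during a test run.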