Arvados 2.7.1 Release Notes

December 12, 2023

The Arvados team is pleased to announce Arvados 2.7.1. This release includes a variety of improvements related to user interface, scalability, performance, and security. We recommend that new and existing installations of 2.7.0 or earlier upgrade to 2.7.1. See Upgrading Arvados for instructions.

Workbench

Workbench 2 project listings now support multi-select. Users can select multiple items from the listing and take action on all of them at once, like moving those items to another project or the trash. #15768

Arvados Workbench screenshot of a project listing with multiple items selected and associated actions available

Revamped the left pane of Workbench 2 to make everything easier to navigate (#19302):

  • Added a “Shell access” item and reordered the list
  • More items are expandable, and the expanded list is presented with most recently used items first
  • When the pane is collapsed, icons provide quick access to top-level navigation
  • The pane is easier to resize
  • The collapse icon renders more consistently

Arvados Workbench screenshot of the collapsed left navigation pane with quick access icons

Arvados Workbench screenshot of the reorganized left navigation pane

The subprocess pane in Workbench 2 shows a progress bar summarizing the status of all of the process’s subprocesses. #20609

Arvados Workbench screenshot of a subprocess pane with a progress bar of subprocesses in various states

When users launch a workflow with directory inputs, Workbench 2 lets the user choose a directory from a collection. #20225

Arvados Workbench screenshot of the directory input selector for a workflow

When a user views a process whose output or log collections have been deleted, Workbench 2 pops up a small info box with more details. This display is both more specific and less intrusive than the generic “Not Found” error dialog that appeared before. #21067

Workbench 2 displays listings in a consistent order when you sort the display by a field with duplicate values. #20526
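
One common way to get a consistent order when the sort field has duplicates is to add a unique tie-breaker to the sort key. The sketch below illustrates that idea only; using the item UUID as the tie-breaker is an assumption, not a description of how Workbench 2 implements it, and the sample records are made up.

```python
def stable_listing(items, field):
    """Sort items by `field`, breaking ties by UUID (assumed tie-breaker)."""
    return sorted(items, key=lambda item: (item[field], item["uuid"]))


# Hypothetical records: two items share the same name.
items = [
    {"uuid": "zzzzz-4zz18-bbbbbbbbbbbbbbb", "name": "results"},
    {"uuid": "zzzzz-4zz18-aaaaaaaaaaaaaaa", "name": "results"},
    {"uuid": "zzzzz-4zz18-ccccccccccccccc", "name": "inputs"},
]
ordered = stable_listing(items, "name")
```

Without the tie-breaker, the two "results" items could appear in either order depending on how the input happened to arrive.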

Reduced visual noise in Workbench 2 by inhibiting the spinner when it loads data in the background. #21077

Compute Scale and Performance Improvements

The Crunch cloud dispatcher will consider using multiple instance types to satisfy a queued container request. It will consider all configured instance types that satisfy the request’s runtime requirements, from the cheapest up to a price limit configured by Containers.MaximumPriceFactor (default 1.5, 150%). This lets Crunch continue dispatching requests even when the best-fitting instance type is unavailable. #20978
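
As a rough illustration of the selection rule, the sketch below filters a list of instance types down to those that fit a request and cost no more than the price factor times the cheapest suitable type. The `InstanceType` fields, the function name, and the assumption that the limit is relative to the cheapest suitable type are all illustrative; the dispatcher itself is not written in Python and weighs more requirements than CPU and RAM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    ram_gib: float
    price: float  # hourly price, arbitrary currency


def candidate_types(types, need_vcpus, need_ram_gib, max_price_factor=1.5):
    """Return suitable instance types, cheapest first, within the price limit."""
    suitable = sorted(
        (t for t in types if t.vcpus >= need_vcpus and t.ram_gib >= need_ram_gib),
        key=lambda t: t.price,
    )
    if not suitable:
        return []
    # Assumption: the limit is the cheapest suitable price times the factor.
    limit = suitable[0].price * max_price_factor
    return [t for t in suitable if t.price <= limit]


# Hypothetical instance types and prices.
types = [
    InstanceType("small", 2, 4, 0.10),
    InstanceType("medium", 4, 8, 0.14),
    InstanceType("large", 8, 16, 0.16),
    InstanceType("xlarge", 16, 32, 0.40),
]
choices = candidate_types(types, need_vcpus=4, need_ram_gib=8)
```

Here "medium" is the best fit, and "large" is also eligible because it costs less than 1.5 times as much. If "medium" is unavailable, the request can still be dispatched on "large".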

Improved the scalability of the Crunch cloud dispatcher by refining how it reacts when the cloud provider reports there is no more capacity for a specific instance type. #20984

Introduced a controller configuration setting API.MaxConcurrentRailsRequests to limit how many requests are forwarded to the Rails API server at once. Default 8. With this change, the default value for API.MaxConcurrentRequests has been restored to 64 (after being lowered to 8 in Arvados 2.7.0). #21124
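
The relationship between the two limits can be pictured as a pair of nested limiters: every request counts against the overall limit, and only requests forwarded to Rails count against the smaller one. The sketch below is a simplified model with hypothetical names. It rejects requests that exceed a limit, which is an assumption made to keep the example short and may differ from what the controller does.

```python
import threading


class RequestLimiter:
    """Toy model of an overall limit plus a stricter limit for Rails requests."""

    def __init__(self, max_requests=64, max_rails_requests=8):
        self._all = threading.BoundedSemaphore(max_requests)
        self._rails = threading.BoundedSemaphore(max_rails_requests)

    def try_start(self, needs_rails):
        """Admit a request if both applicable limits have room (non-blocking)."""
        if not self._all.acquire(blocking=False):
            return False
        if needs_rails and not self._rails.acquire(blocking=False):
            self._all.release()  # give back the overall slot we just took
            return False
        return True

    def finish(self, needs_rails):
        """Release the slots held by a request admitted with try_start."""
        if needs_rails:
            self._rails.release()
        self._all.release()
```

With the defaults, up to 64 requests can be in flight at once, but no more than 8 of them can be waiting on the Rails API server.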

Improved the API server logic to ensure name uniqueness so it can better handle many requests with the same name arriving around the same time. Where it previously used a timestamp in the revised name, it now uses part of the object UUID. #21205
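
To see why a UUID-based suffix handles bursts better than a timestamp, consider the sketch below. Two requests that arrive in the same instant would get the same timestamp, but each object has its own UUID. The function name and the exact format of the revised name are hypothetical, and the real API server enforces uniqueness in the database rather than against an in-memory set.

```python
def unique_name(requested, taken, object_uuid):
    """Return `requested`, or a revised name built from part of the UUID."""
    if requested not in taken:
        return requested
    # Use the last segment of the UUID as the distinguishing suffix
    # (hypothetical format, for illustration only).
    suffix = object_uuid.rsplit("-", 1)[-1]
    return f"{requested} ({suffix})"


taken = {"results"}
first = unique_name("results", taken, "zzzzz-4zz18-0123456789abcde")
second = unique_name("results", taken, "zzzzz-4zz18-edcba9876543210")
```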

The installer defaults to configuring S3 Keep volumes with DriverParameters.PrefixLength set to 3. This provides better storage scaling on larger installs. If you are upgrading an existing install, make sure any existing volumes retain their current PrefixLength setting. See the comment that accompanies this setting in tools/salt-install/config_examples/multi_host/aws/pillars/arvados.sls. #21125

Improved the performance of the code that updates user permissions. In clusters with large numbers of users, user creation and adding a user to a group may be up to two orders of magnitude faster. #21030, #21160

Improved the performance of common user API queries by adding a database btree index to the name column of collections and groups. #20990

Fixed several bugs in keep-web that could cause it to crash with an “unlock of unlocked mutex” panic message under heavy concurrency. #21227

Security

When a user account is unsetup on the login cluster of a federation, it will also be unsetup on satellite clusters. This helps prevent situations where the account could retain previous permissions after being reactivated. #20831

When a user logs out of a satellite cluster in a federation, their request will be sent to the login cluster, and that response is given priority. This helps prevent situations where a user intended to expire their API token but it’s still usable across the federation because it remains valid on the login cluster. #21021

When a cluster uses OpenID for authentication and a user logs out, they’ll be redirected to their OpenID provider’s logout endpoint if it is known. This gives users the opportunity to end their single sign-on session, so that a malicious user cannot log back in without re-entering credentials. #21137

When keep-web receives a request with API tokens in the URL, it redirects the user to a URL without any of those tokens. Previously the redirect URL only had the first token removed, which had the potential to leave valid credentials visible if other software built a link from a malicious keep-web URL. #21025
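
The difference between removing the first token and removing all of them can be sketched with the Python standard library. This is an illustration of the idea, not keep-web's code (keep-web is written in Go). It assumes the tokens arrive as `api_token` query parameters, and the host name is a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tokens(url, param="api_token"):
    """Return `url` with every occurrence of the token parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


redirect = strip_tokens(
    "https://collections.example.com/c=abc/file.txt?api_token=one&x=1&api_token=two"
)
```

A version that only dropped the first `api_token` would leave `api_token=two` in the redirect URL, which is the kind of leftover credential the fix addresses.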

The uniqueness of the username field in user records is now decided by the login cluster in a federation. This helps prevent situations where different clusters in a federation think different users have a specific username. #20284

Improved Workbench 2’s defenses against XSS and similar attacks by sanitizing user-provided HTML such as the description field of projects and collections. #21026

Upgraded various Workbench 2 dependencies in response to security advisories in those libraries. #21033

Command Line Tools

arv-copy can copy data from HTTP(S) URLs to collections. This works similarly to the arvados-cwl-runner feature, so it can be used to more easily import data to Arvados for analysis. See the user guide for more information. #20937

Made arv-copy more complete by extending it to follow CWL references inside a primary workflow. #20933

arvados-cwl-runner supports an extension hint arv:SeparateRunner to run subworkflows in a dedicated process. This causes subworkflows to be presented as their own workflows in Workbench and other Arvados UI, at the cost of a little compute overhead. #20825

arv-mount automatically raises its own open files limit as needed to accommodate any disk cache size requested by the user. Before this, the open files limit meant the maximum effective size of the disk cache was 80GiB, even when the user requested more. #21223
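
A process can raise its own soft open files limit, up to the hard limit, using the standard `resource` module on Unix-like systems. The sketch below shows that mechanism under stated assumptions: that each cached block holds one open file, that blocks are up to 64 MiB, and that some headroom is wanted for other files. The function names and the headroom value are hypothetical and are not taken from arv-mount.

```python
import resource

BLOCK_SIZE = 64 * 1024 * 1024  # assumed maximum size of one cached block


def needed_open_files(cache_bytes, headroom=512):
    """Estimate open files needed: one per cached block, plus headroom."""
    blocks = -(-cache_bytes // BLOCK_SIZE)  # ceiling division
    return blocks + headroom


def raise_nofile_limit(needed):
    """Raise the soft RLIMIT_NOFILE toward `needed`; return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= needed:
        return soft  # already high enough; never lower the limit
    # An unprivileged process can raise its soft limit only up to the hard limit.
    target = needed if hard == resource.RLIM_INFINITY else min(needed, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

If the hard limit is lower than the requested cache needs, the soft limit stops at the hard limit, and an administrator would have to raise the hard limit separately.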

arvados-client and other tools written in Go will try to load CA certificates from /etc/arvados/ca-certificates.crt if it exists to better match the behavior of other Arvados software. #21086

arvados-server install logs a warning instead of failing immediately if it cannot save sysctl settings to better accommodate development systems and build pipelines. #21055

SDKs

Improved the reliability of outgoing request concurrency in the Go SDK by fixing a bug where a client object could get its own request limiter, causing the application to send more requests concurrently than expected. #21227

Improved the scale of outgoing request concurrency in the Go SDK by using one default global request limiter per API server, so requests to different clusters don’t limit each other. #21227

The Go SDK automatically trims whitespace around user-provided tokens. Whitespace characters are never valid in Arvados API tokens, but they are sometimes introduced when users copy and paste across applications, so this makes Go servers and tools more user-friendly without sacrificing security. #21217

The Go SDK no longer retries “invalid outgoing header” request errors, since they will never succeed. #21217

Improved compatibility with other software by ensuring the Go SDK never generates a URL with a path that starts with //. #21217
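
The token trimming and path handling described above are simple to picture. The sketch below shows both ideas in Python for illustration only. The function names are hypothetical and this is not the Go SDK's code; the token and host name are placeholders.

```python
def clean_token(token):
    """Drop leading and trailing whitespace picked up by copy and paste."""
    return token.strip()


def join_url(base, path):
    """Join a base URL and a path so the path never starts with //."""
    return base.rstrip("/") + "/" + path.lstrip("/")


token = clean_token("  v2/zzzzz-gj3su-0123456789abcde/placeholdersecret\n")
url = join_url("https://zzzzz.example.com/", "/arvados/v1/users")
```

Naive concatenation of the same two strings would produce `https://zzzzz.example.com//arvados/v1/users`, which some proxies and servers treat differently from the single-slash form.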

The Python SDK reference has been expanded:

  • The arvados.api, arvados.collection, and arvados.util modules are fully documented. #19818, #19821, #19830
  • The arvados.api_resources documentation covers request objects and their API for completeness. #21132
  • The documentation in the arvados.retry and arvados.safeapi modules has been updated for improved display and consistency. #20885, #21211

The Java SDK now has timeouts as configuration parameters (GitHub PR 220) and an option to upload an InputStream to KeepWebApi (GitHub PR 221). NOTE: Due to a release process mistake, the correct version of this package to use is “2.7.1.1”.

Keep

Administrators can put a Keep volume in maintenance mode by setting a new configuration flag Volumes.*.AllowTrashWhenReadOnly. When this is set along with the existing ReadOnly flag, the volume will not accept any new data, but will still delete blocks as the collections that use them are deleted. #21126
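
The way the two flags combine can be summarized as a small truth table. The sketch below models it with a hypothetical function. It reflects only what is described above and is not keepstore's implementation.

```python
def volume_permissions(read_only=False, allow_trash_when_read_only=False):
    """Model which operations a volume accepts for a given flag combination."""
    return {
        "write": not read_only,
        "trash": (not read_only) or allow_trash_when_read_only,
    }


normal = volume_permissions()
read_only = volume_permissions(read_only=True)
maintenance = volume_permissions(read_only=True, allow_trash_when_read_only=True)
```

In this model, maintenance mode is the combination where writes are refused but trash is still allowed, so the volume can drain as collections are deleted.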

Administrators can control how much RAM keepstore uses to handle updates from keep-balance with the new settings Collections.BalancePullLimit and Collections.BalanceTrashLimit. These limit the number of items keep-balance will send to keepstore in a single pull or trash update. Larger settings will use more RAM to finish balancing blocks in less time, but leave less available for regular work. #21189