Arvados is a platform for storing, organizing, processing, and sharing genomic and other big data, at scale.

Run anywhere: Arvados runs in the cloud on AWS, Azure and GCP, as well as on premises.
Auto-scaling of compute resources in the cloud: Arvados can scale compute resources dynamically on AWS, Azure and GCP.
Large scale: a single Arvados instance can store petabytes of data and use thousands of cores of compute simultaneously.
Everything is an API: Arvados is designed to be integrated with existing infrastructure.
Commercially supported: Curii Corporation provides managed Arvados installations as well as commercial support for Arvados. Please contact Curii Corporation for more information.

Try it

Installation options

Arvados can be installed in a number of ways, as documented on the Installation options page in the Arvados documentation.

Source code

The Arvados source code is available on GitHub.



Keep is a content-addressable storage system for managing and storing large collections of files with durable, cryptographically verifiable references and high-throughput processing. Keep works on a wide range of underlying file systems.
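
To illustrate what "content-addressable" and "cryptographically verifiable" mean in general, here is a minimal Python sketch of a toy in-memory store. It is not Keep's implementation or API: the class name, the method names, and the choice of SHA-256 are all assumptions made for illustration only.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store (hypothetical, not Keep's API).

    Each block is stored under the hash of its own bytes, so the
    reference to a block also serves as a checksum for it.
    """

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The locator is derived from the content alone (SHA-256 is an
        # assumption chosen for this sketch).
        locator = hashlib.sha256(data).hexdigest()
        self._blocks[locator] = data
        return locator

    def get(self, locator: str) -> bytes:
        data = self._blocks[locator]
        # Re-hash on read: a mismatch means the stored bytes were altered.
        if hashlib.sha256(data).hexdigest() != locator:
            raise ValueError("block does not match its locator")
        return data


store = ContentAddressedStore()
first = store.put(b"ACGTACGT")
second = store.put(b"ACGTACGT")  # identical content, identical locator
```

Because the locator depends only on the bytes, writing the same data twice yields the same reference and a single stored copy, and any reader can check that what it received matches the reference it asked for.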


Crunch is a container orchestration engine for running complex, multi-part workflows in a way that is flexible, scalable, and supports versioning, reproducibility, and provenance. In a cloud environment, Crunch scales compute dynamically.

Standards Efforts

The Arvados community is collaborating closely with several standards efforts.

Common Workflow Language

The goal of the CWL project is to create specifications that enable data scientists to describe analysis tools and workflows that are powerful, easy to use, portable, and support reproducibility.

Global Alliance

The Global Alliance for Genomics and Health (GA4GH) is a global standards body defining data formats and APIs for precision medicine.