Serverless ML Training in Your Own Cloud
Define training pipelines in regular Python and dispatch them to arbitrary compute. Easy to iterate on, debuggable, DSL-free, and infrastructure-agnostic.
Dispatch execution to any compute
Developers request compute that is launched from Kubernetes, elastic compute, bare metal, or a mixture.
import runhouse as rh

# Define a cluster that you want to launch
# This can be provisioned from elastic compute
# or use existing Kubernetes clusters or VMs
my_cluster = rh.cluster(
    name="rh-a10x",
    instance_type="A10G:1",
    memory="32+",
    provider="aws",
).up_if_not()

# Save and reuse the cluster across multiple pipeline steps,
# pipelines, or simply to ensure reproducibility.
my_cluster.save()

# Later... load your saved cluster for future use
my_cluster = rh.cluster(name="rh-a10x").up_if_not()
Flexible and debuggable ML pipelines
Deploy code updates in less than 5 seconds and get streaming logs back for fast, iterative development.
# Define your model class using normal code
class MyModelClass:
    def train(self):
        ...

    def predict(self):
        ...

    def save(self):
        ...

# Send your class to remote
RemoteClass = rh.module(MyModelClass).to(my_cluster)

# Instantiate and call an instance of the remote class
remote_model = RemoteClass()
remote_model.train()
Break the barrier between research and production
Manage your ML lifecycle using software development best practices on regular code. Deploy with no extra translation.
$ git add MyModelClass.py
$ git commit -m "Refactor the train method"
$ git push
$ echo "Develop code, not orchestrator pipelines"
High-quality telemetry, out of the box
Automatically persist logs, track GPU/CPU/memory utilization, and audit resource access.
# API route to fetch logs for a resource
@router.get(
    "/{uri}/logs",
    response_description="Resource logs retrieved",
)
@send_event
async def resource_logs_preview(...):
    ...

# API route to fetch cluster status and metrics
@router.get(
    "/{uri}/cluster/status",
    response_description="Cluster status retrieved",
)
@send_event
async def load_cluster_status(...):
    ...
Runhouse
Effortlessly program powerful ML systems across arbitrary compute in regular Python.
Works with your stack
Easily integrate within your existing pipelines, code, and development workflows.
$ pip install runhouse
Loved by research and infra teams alike
Runhouse is built for end-to-end ML development. Dispatch work quickly from notebooks or IDEs during local development, then run the same code as-is inside Kubernetes, CI, or your favorite orchestrator. No more push-and-pray.
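A minimal sketch of that local-to-production loop: the same plain-Python function runs immediately in a notebook, and the dispatch step uses the `rh-a10x` cluster from the example above. The dispatch is gated behind an environment variable here (a device of this sketch, not a runhouse feature), since it assumes runhouse is installed, the cluster was previously saved, and cloud credentials are configured.

```python
import os

def evaluate(batch):
    # Plain Python: identical whether it runs locally or on remote compute
    return sum(batch) / len(batch)

# Works immediately during local development...
local_result = evaluate([1.0, 2.0, 3.0])

# ...and the same function can be dispatched unchanged.
if os.environ.get("RUNHOUSE_DISPATCH"):
    import runhouse as rh

    cluster = rh.cluster(name="rh-a10x").up_if_not()
    remote_evaluate = rh.function(evaluate).to(cluster)
    remote_result = remote_evaluate([1.0, 2.0, 3.0])
```

Because the function body never changes, the notebook version and the orchestrator version stay byte-identical; only the dispatch target differs.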
Runs inside your own infrastructure
Execution stays inside your cloud(s), with total flexibility to cost-optimize or scale to new providers.
Use Cases
ML that Runs
An ML platform that improves developer experience while increasing development velocity.
Without Runhouse:
Research runs on siloed compute, sampled data, and notebook code to enable iterative development. Reaching production requires a slow translation to orchestrator pipelines, which become difficult to debug when errors arise.
With Runhouse: Fast Software Development
Code is written and executed identically in research and production. Errors can be debugged on a branch from local IDEs and merged into production using a standard development lifecycle.
Operationalize your ML as a living stack.
Search & Sharing
Runhouse Den provides you with an interface to explore your ML artifacts. Easily share with your teammates, search through your resources, and view details all in one place.
Observability
View cluster status, automatically persist logs, track GPU/CPU/memory utilization, and enable more efficient debugging for your team. Gain insights with trends and simple dashboards.
Auth & Access Control
Den makes it easy to control access to your services. Provide individual teammates with read or write access, or choose "public" visibility to expose public API endpoints.
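Granting access might look like the following sketch; the teammate addresses and the exact `share` keyword arguments are assumptions for illustration and may differ by runhouse version, and the call is gated behind an environment variable since it requires a Den account and a previously saved resource.

```python
import os

# Hypothetical teammate -> access-level mapping for this example
access = {
    "alice@example.com": "read",
    "bob@example.com": "write",
}

if os.environ.get("RUNHOUSE_SHARE"):
    import runhouse as rh

    # Load a previously saved resource (the cluster from the example above)
    service = rh.cluster(name="rh-a10x")
    for user, level in access.items():
        # Grant Den-managed access per teammate; keyword names are
        # assumptions here, check your runhouse version's docs.
        service.share(users=[user], access_level=level)
```

Read access lets a teammate load and call the resource; write access also lets them update it.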