Compute: Clusters, Functions, Packages, & Envs

Runhouse has several abstractions to provide a seamless flow of code and execution across local and remote compute. The abstractions are cloud provider-agnostic, and provide living, reusable services.

The Cluster and Function APIs blur the line between program execution and deployment.

The Env and Package APIs help to provide convenient dependency isolation and management.

Install Runhouse

!pip install runhouse
import runhouse as rh

Optionally, log in to Runhouse to sync any secrets.

!runhouse login

Cluster

Runhouse provides various APIs for interacting with remote clusters, such as terminating an on-demand cloud cluster or running remote CLI or Python commands from your local dev environment.

Initialize your Cluster

There are three types of supported clusters:

  1. Bring-your-own (BYO) Cluster: Existing clusters that you already have up and running, accessed through an IP address and SSH credentials. Please refer to the Bring-Your-Own Cluster section for further instructions.

  2. On-demand Cluster: Associated with your cloud account (AWS, GCP, Azure, LambdaLabs). There are additional features for these clusters, such as cluster (auto) stop. Please refer to On-Demand Cluster for instructions on first getting cloud credentials set up.

  3. SageMaker Cluster (Alpha): Clusters that are created and managed through SageMaker, which can be used as a compute backend (just like BYO or On-Demand clusters) or for running dedicated training jobs. Please refer to the SageMaker Cluster section for instructions on getting set up with SageMaker.

Each cluster must be provided with a unique name identifier during construction. This name parameter is used for saving down or loading previously saved clusters, and is also used in various CLI commands for the cluster.

# BYO cluster
cluster = rh.cluster(  # using private key
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'ssh_private_key': '<path_to_key>'},
)

cluster = rh.cluster(  # using password
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'password': '******'},
)

# Using a Cloud provider
cluster = rh.ondemand_cluster(
    name="cpu-cluster",
    instance_type="CPU:8",
    provider="cheapest",  # "AWS", "GCP", "Azure", "Lambda", or "cheapest" (default)
    autostop_mins=60,     # Optional, defaults to default_autostop_mins; -1 suspends autostop
)
# Launch the cluster
cluster.up()

# Using SageMaker as the compute provider
cluster = rh.sagemaker_cluster(
    name="sm-cluster",
    profile="sagemaker",  # AWS profile with a role ARN configured for SageMaker
)
# Launch the cluster
cluster.up()
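
Because the name uniquely identifies the cluster, its config can be saved and reloaded later by name. Here is a minimal sketch of that flow (assuming the cpu-cluster defined above; if you are not logged in to Runhouse, the config is saved locally):

# Save the cluster's config so it can be reloaded by name later
cluster.save()

# In a later session, reload the saved cluster using only its name
reloaded_cluster = rh.cluster(name="cpu-cluster")
reloaded_cluster.run(['echo "hello from cpu-cluster"'])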

You can set default configs for future cluster constructions. These defaults are associated only with your local environment (if you don’t log in to Runhouse), or can be reused across devices (if they are saved to your Runhouse account).

rh.configs.set('use_spot', False)
rh.configs.set('default_autostop', 30)

rh.configs.upload_defaults()

Useful Cluster APIs

To run CLI or Python commands on the cluster:

cluster.run(['pip install numpy && pip freeze | grep numpy'])
INFO | 2023-08-29 03:35:44.910826 | Running command on cpu-cluster: pip install numpy && pip freeze | grep numpy
Warning: Permanently added '34.205.23.213' (ED25519) to the list of known hosts.
Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (1.25.2)
numpy==1.25.2
[(0,
  'Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (1.25.2)\nnumpy==1.25.2\n',
  "Warning: Permanently added '34.205.23.213' (ED25519) to the list of known hosts.\r\n")]
cluster.run_python(['import numpy', 'print(numpy.__version__)'])
INFO | 2023-08-29 03:35:50.911455 | Running command on cpu-cluster: python3 -c "import numpy; print(numpy.__version__)"
1.25.2
[(0, '1.25.2\n', '')]

To ssh into the cluster:

# Python
cluster.ssh()

# CLI
!ssh cpu-cluster

To tunnel a JupyterLab server into your local browser:

# Python
cluster.notebook()

# CLI
!runhouse notebook cpu-cluster

To open a port, e.g. if you want to run an application on the cluster that requires an open port (such as Tensorboard or Gradio):

cluster.ssh_tunnel(local_port=7860, remote_port=7860)
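
As a rough usage sketch (the app script name below is hypothetical), once the tunnel is open you can launch the app on the cluster and reach it from your local browser:

# Start an app on the cluster that listens on the remote port (hypothetical script)
cluster.run(['python3 my_gradio_app.py --port 7860'])

# Then open http://localhost:7860 locally; traffic is forwarded through the SSH tunnel.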

Function

Runhouse’s Function API lets you define functions to be run on remote hardware (your cluster above!). Simply pass in a local (or a GitHub) function, the intended remote hardware, and any dependencies; Runhouse will handle the rest for you.

Basic Functions

Let’s start with a simple local function getpid, which takes in optional parameters a and b and returns the process ID plus a + b.

# Local Function
def getpid(a=0, b=0):
    import os
    return os.getpid() + a + b

To construct a function that runs getpid on a remote cluster, we wrap it using rh.function, and specify system=cluster.

# Remote Function
getpid_remote = rh.function(fn=getpid).to(system=cluster)
INFO | 2023-08-29 03:59:14.328987 | Writing out function function to /Users/caroline/Documents/runhouse/runhouse/docs/notebooks/basics/getpid_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2023-08-29 03:59:14.332706 | Setting up Function on cluster.
INFO | 2023-08-29 03:59:14.807140 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-08-29 03:59:15.280859 | Authentication (publickey) successful!
INFO | 2023-08-29 03:59:17.534412 | Found credentials in shared credentials file: ~/.aws/credentials
INFO | 2023-08-29 03:59:18.002794 | Checking server cpu-cluster
INFO | 2023-08-29 03:59:19.059074 | Server cpu-cluster is up.
INFO | 2023-08-29 03:59:19.061851 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 03:59:20.822780 | Calling env_20230829_035913.install
base servlet: Calling method install on module env_20230829_035913
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 03:59:22.728154 | Time to call env_20230829_035913.install: 1.91 seconds
INFO | 2023-08-29 03:59:22.981633 | Function setup complete.

To run the function, simply call it just as you would a local function, and the function automatically runs on your specified hardware!

print(f"local function result: {getpid()}") print(f"remote function result: {getpid_remote()}")
INFO | 2023-08-29 03:59:43.821391 | Calling getpid.call
local function result: 7592
base servlet: Calling method call on module getpid
INFO | 2023-08-29 03:59:44.078775 | Time to call getpid.call: 0.26 seconds
remote function result: 1382396

Git Functions

A neat feature of Runhouse is the ability to take a function from a GitHub repo and create a wrapper around it to be run remotely. This saves you the effort of cloning or copying the function yourself. To do so, simply pass the function’s GitHub URL into rh.function.

We’ve implemented the same getpid function here. Below, we demonstrate how we can directly use the GitHub link and function name to run this function on remote hardware, without needing to clone the repo ourselves or reimplement the function locally.

pid_git_remote = rh.function(
    fn='https://github.com/run-house/runhouse/blob/main/docs/notebooks/sample_fn.py:getpid',
    system=cluster,
)
INFO | 2023-08-29 04:00:01.870718 | Setting up Function on cluster.
INFO | 2023-08-29 04:00:01.873021 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:00:03.145979 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:00:04.625905 | Calling env_20230829_035957.install
base servlet: Calling method install on module env_20230829_035957
Installing package: GitPackage: https://github.com/run-house/runhouse.git@main
Pulling: git -C ./runhouse fetch https://github.com/run-house/runhouse.git
Checking out revision: git checkout main
Installing GitPackage: https://github.com/run-house/runhouse.git@main with method local.
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 04:00:08.100045 | Time to call env_20230829_035957.install: 3.47 seconds
INFO | 2023-08-29 04:00:08.275688 | Function setup complete.
pid_git_remote(1)
INFO | 2023-08-29 04:00:12.015937 | Calling getpid.call
base servlet: Calling method call on module getpid
INFO | 2023-08-29 04:00:12.285294 | Time to call getpid.call: 0.27 seconds
1382397

Function Call Types: .remote and .run

You can use fn.remote() to have the function return a remote object rather than the actual result. This is a convenient way to avoid passing large objects back and forth to your laptop, or to run longer executions in notebooks without locking up the kernel.

getpid_remote_obj = getpid_remote.remote()
getpid_remote_obj
INFO | 2023-08-29 04:42:17.026532 | Calling getpid.call
base servlet: Calling method call on module getpid
INFO | 2023-08-29 04:42:17.900012 | Time to call getpid.call: 0.87 seconds
<runhouse.resources.blobs.blob.Blob at 0x154dab3d0>

To retrieve the data from the returned remote object, you can call .fetch() on the remote object.

getpid_remote_obj.fetch()
INFO | 2023-08-29 04:42:18.626515 | Getting getpid_call_20230829_044209_708686
INFO | 2023-08-29 04:42:18.780105 | Time to get getpid_call_20230829_044209_708686: 0.15 seconds
1382396

To run a function async, use fn.run(), which returns a run_key that can be used to retrieve the results and logs at a later point.

getpid_run_key = getpid_remote.run()
getpid_run_key
INFO | 2023-08-29 04:42:20.182323 | Calling getpid.call
INFO | 2023-08-29 04:42:20.318719 | Time to call getpid.call: 0.14 seconds
'getpid_call_20230829_044212_868665'

To retrieve the result of the function run, you can call cluster.get() and pass in the run_key.

cluster.get(getpid_run_key)
INFO | 2023-08-29 04:42:28.747188 | Getting getpid_call_20230829_044212_868665
INFO | 2023-08-29 04:42:28.875886 | Time to get getpid_call_20230829_044212_868665: 0.13 seconds
1382396

Function Logging

stream_logs

To stream logs back locally during the remote function call, pass stream_logs=True to the function call.

getpid_remote(stream_logs=True)
INFO | 2023-08-29 04:43:17.812658 | Calling getpid.call
base servlet: Calling method call on module getpid
INFO | 2023-08-29 04:43:18.107531 | Time to call getpid.call: 0.29 seconds
1382396

Function logs are also automatically written to a log file on the cluster they run on. You can refer to the Runhouse Logging Docs for more information on accessing these logs.

Modules

In addition to running basic functions remotely, Runhouse lets you define classes that live and are run remotely, through the Module API.

Converting existing class to Runhouse Module

If you have an existing non-Runhouse class that you would like to run remotely, you can do so as follows:

from package import Model

RemoteModel = rh.module(cls=Model, system=cluster)
remote_model = RemoteModel(model_id="bert-base-uncased", device="cuda")
remote_model.predict("Hello world!")  # Runs on cluster

Creating your own Runhouse Module

To construct your own Runhouse Module class, simply subclass rh.Module, instantiate the class locally, and then send it over to your cluster.

Note that because this class is constructed locally before being sent over to the remote cluster, any computationally heavy operation that should only happen remotely, such as loading a dataset or model, should be wrapped in a function and run remotely. One way of doing so is through lazy initialization, as in the data property of the module below.

# pid_module.py
import os
import runhouse as rh

class PIDModule(rh.Module):
    def __init__(self, a: int = 0):
        super().__init__()
        self.a = a

    @property
    def data(self):
        if not hasattr(self, '_data'):
            self._data = load_dataset()
        return self._data

    def getpid(self):
        return os.getpid() + self.a

When working in a notebook setting, we define the class in another file, pid_module.py, and load in the module for use below. For Python scripts, the class can be defined in the same file as the script.

from pid_module import PIDModule

remote_module = PIDModule(a=5).to(cluster)
remote_module.getpid()
INFO | 2023-09-05 19:57:10.034443 | Calling PIDModule.getpid
base servlet: Calling method getpid on module PIDModule
INFO | 2023-09-05 19:57:10.308916 | Time to call PIDModule.getpid: 0.27 seconds
21806

Env + Packages

Our sample getpid function used only built-in Python dependencies, so we did not need to worry about the function’s environment.

For more complex functions relying on external dependencies, Runhouse provides concepts for packages (individual dependencies/installations) and environments (group of packages or a conda env).

Package Types

Runhouse supports pip, conda, reqs and git packages, which can be constructed in the following ways.

Oftentimes, when using Packages in the context of environments (Envs), you don’t need to construct them yourself; you can just pass in the corresponding string, and Runhouse internals will handle the conversion and installation for you.

pip_package = rh.Package.from_string("pip:numpy")
conda_package = rh.Package.from_string("conda:torch")
reqs_package = rh.Package.from_string("reqs:./")
git_package = rh.GitPackage(git_url='https://github.com/huggingface/diffusers.git',
                            install_method='pip', revision='v0.11.1')
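
As noted above, when Packages are used inside an Env you can usually skip constructing them explicitly and just pass the corresponding strings. A minimal sketch, reusing two of the packages above:

# Runhouse converts these strings into Package objects and installs them on the cluster
env = rh.env(reqs=["pip:numpy", "reqs:./"])

Plain package names like "numpy" also work, and default to a pip install.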

You can also send packages between local, remote, and file storage.

local_package = rh.Package.from_string("local/path/to/folder")

package_on_s3 = local_package.to(system="s3", path="/s3/path/to/folder")
package_on_cluster = local_package.to(system=cluster)

Envs

Envs, or environments, keep track of your package installs and corresponding versions. This allows for reproducible dev environments, and convenient dependency isolation and management.

Base Env

The basic Env resource just consists of a list of Packages, or strings that represent the packages. You can additionally add any environment variables by providing a dict or a local .env file path, and also set the working directory to be synced over (defaults to the base GitHub repo).

env = rh.env(reqs=["numpy", reqs_package, git_package], env_vars={"USER": "*****"})
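
The other options mentioned above can be set in the same call. A short sketch (the .env path and env name here are illustrative):

# Env vars loaded from a local .env file; working_dir controls which directory is synced over
env_from_file = rh.env(
    reqs=["numpy"],
    env_vars=".env",    # path to a local .env file (illustrative)
    working_dir="./",   # sync the current directory instead of the base repo
    name="dotenv_env",
)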

When you send an environment object to a cluster, the environment is automatically set up (packages are installed) on the cluster.

env_on_cluster = env.to(system=cluster)
INFO | 2023-08-29 04:44:06.955053 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:44:08.250678 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:44:09.741572 | Calling env_20230829_044402._set_env_vars
base servlet: Calling method _set_env_vars on module env_20230829_044402
INFO | 2023-08-29 04:44:10.028261 | Time to call env_20230829_044402._set_env_vars: 0.29 seconds
INFO | 2023-08-29 04:44:10.029212 | Calling env_20230829_044402.install
base servlet: Calling method install on module env_20230829_044402
Installing package: Package: numpy
Installing Package: numpy with method pip.
Running: pip install numpy
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
Installing package: GitPackage: https://github.com/huggingface/diffusers.git@v0.11.1
Pulling: git -C ./diffusers fetch https://github.com/huggingface/diffusers.git
Checking out revision: git checkout v0.11.1
Installing GitPackage: https://github.com/huggingface/diffusers.git@v0.11.1 with method pip.
Running: pip install ./diffusers
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 04:44:19.111342 | Time to call env_20230829_044402.install: 9.08 seconds

Conda Env

The CondaEnv resource represents a Conda environment that can be used to set up reproducible Conda envs across clusters.

There are several ways to construct a Runhouse CondaEnv object using rh.conda_env, by passing any of the following into the conda_env parameter:

  1. A yaml file corresponding to a conda environment config

  2. A dict corresponding to a conda environment config

  3. Name of an existing conda env on your local machine

  4. Leaving the argument empty. In this case, we’ll construct a new Conda environment for you, using the list you pass into reqs.

Beyond the conda config, you can also add any additional requirements you’d like to install in the environment by passing a list of packages into reqs.

# 1. config yaml file
conda_env = rh.conda_env(conda_env="conda_env.yml", reqs=["numpy", "diffusers"], name="yaml_env")

# 2. config dict
conda_dict = {"name": "conda_env", "channels": ["conda-forge"], "dependencies": ["python=3.10.0"]}
conda_env = rh.env(conda_env=conda_dict, name="dict_env")

# 3. local conda env
conda_env = rh.conda_env(conda_env="local_conda_env", name="from_local_env")

# 4. empty, construct from reqs
conda_env = rh.conda_env(reqs=["numpy", "diffusers"], name="new_env")

As with the base env, we can set up a conda env on the cluster with:

conda_env_on_cluster = conda_env.to(system=cluster)
INFO | 2023-08-29 04:48:21.600485 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:48:23.132095 | Calling new_env.install
new_env servlet: Calling method install on module new_env
Env already installed, skipping
INFO | 2023-08-29 04:48:24.358608 | Time to call new_env.install: 1.23 seconds

Previously in the cluster section, we mentioned several cluster APIs such as running CLI or Python commands. These all run on the base environment in the examples above, but now that we’ve defined a Conda env, let’s demonstrate how we can accomplish this inside a Conda env on the cluster:

# run Python command within the conda env
cluster.run_python(['import numpy', 'print(numpy.__version__)'], env=conda_env)
INFO | 2023-08-29 05:14:08.725396 | Running command on cpu-cluster: conda run -n new_env python3 -c "import numpy; print(numpy.__version__)"
1.25.2
[(0, '1.25.2\n\n', '')]
# install additional package on given env
cluster.install_packages(["pandas"], env=conda_env)

Putting it all together – Cluster, Function/Module, Env

Now that we understand how clusters, functions, and packages/environments work, we can go on to implement more complex functions that require external dependencies, and seamlessly run them on a remote cluster.

def add_lists(list_a, list_b):
    import numpy as np

    return np.add(np.array(list_a), np.array(list_b))

Note that in the function defined, we include the import statement import numpy as np within the function. The import needs to be inside the function definition in notebook or interactive environments, but can be outside the function if being used in a Python script.

env = rh.env(reqs=["numpy"])
add_lists_remote = rh.function(fn=add_lists).to(system=cluster, env=env)

list_a = [1, 2, 3]
list_b = [2, 3, 4]
add_lists_remote(list_a, list_b)
INFO | 2023-08-29 05:20:27.959315 | Writing out function function to /Users/caroline/Documents/runhouse/runhouse/docs/notebooks/basics/add_lists_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2023-08-29 05:20:27.962973 | Setting up Function on cluster.
INFO | 2023-08-29 05:20:27.965670 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 05:20:29.406978 | Calling env_20230829_052021.install
base servlet: Calling method install on module env_20230829_052021
Installing package: Package: numpy
Installing Package: numpy with method pip.
Running: pip install numpy
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 05:20:32.575986 | Time to call env_20230829_052021.install: 3.17 seconds
INFO | 2023-08-29 05:20:32.774676 | Function setup complete.
INFO | 2023-08-29 05:20:32.791597 | Calling add_lists.call
base servlet: Calling method call on module add_lists
INFO | 2023-08-29 05:20:33.086075 | Time to call add_lists.call: 0.29 seconds
array([3, 5, 7])

Now that you understand the basics, feel free to play around with more complicated scenarios! You can also check out our additional API and example usage tutorials on our docs site.

Cluster Termination

To terminate the cluster, you can call sky down cluster-name from the CLI or cluster.teardown() in Python.

!sky down cpu-cluster
# or
cluster.teardown()