
Module

A Module represents a class that can be sent to and used on remote clusters and environments. A Module can live on remote hardware, and its class methods can be called remotely.

Module Factory Method

runhouse.module(cls: Type | None = None, name: str | None = None, load_from_den: bool = True, dryrun: bool = False)[source]

Returns a Module object, which can be used to instantiate and interact with the class remotely.

The behavior of Modules (and subclasses thereof) is as follows:
  • Any callable public method of the module is intercepted and executed remotely over RPC, with the exception of certain functions Python doesn’t make interceptable (e.g. __call__, __init__) and methods of the Module class itself (e.g. to, fetch, etc.). Properties and private methods are not intercepted and will be executed locally.

  • Any method which executes remotely may be called normally, e.g. model.forward(x), or asynchronously, e.g. key = model.forward.run(x) (which returns a key to retrieve the result with cluster.get(key)), or with run_obj = model.train.remote(x), which runs synchronously but returns a remote object to avoid passing heavy results back over the network.

  • Setting attributes, both public and private, will be executed remotely, with the new values only being set in the remote module and not the local one. This excludes any methods or attributes of the Module class proper (e.g. system or name), which will be set locally.

  • Attributes and private properties can be fetched with the remote property, and the full resource can be fetched using .fetch(), e.g. model.remote.weights, model.remote.__dict__, model.fetch().

  • When a module is sent to a cluster, its public attributes are serialized, sent over, and repopulated in the remote instance. This means that any subsequent changes to the local module’s attributes will not be reflected in the remote module.
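The interception behavior described above can be sketched, cluster-free, with a plain-Python proxy. This is only an illustration of the pattern, not Runhouse's actual implementation; all names here are hypothetical, and a local "executor" function stands in for the remote RPC layer:

```python
# Simplified sketch of the interception behavior: public callable attributes
# are wrapped so the call can be routed elsewhere, while plain attributes
# and private names pass through untouched.

class InterceptingProxy:
    def __init__(self, wrapped, executor):
        self._wrapped = wrapped
        self._executor = executor  # stands in for the remote RPC layer

    def __getattr__(self, attr):
        value = getattr(self._wrapped, attr)
        # Public methods are intercepted; properties and private names are not
        if callable(value) and not attr.startswith("_"):
            def routed_call(*args, **kwargs):
                return self._executor(value, *args, **kwargs)
            return routed_call
        return value

class Adder:
    factor = 10
    def add(self, x, y):
        return (x + y) * self.factor

calls = []
def logging_executor(fn, *args, **kwargs):
    calls.append(fn.__name__)  # pretend this call went over RPC
    return fn(*args, **kwargs)

proxy = InterceptingProxy(Adder(), logging_executor)
print(proxy.add(1, 2))   # routed through the executor
print(proxy.factor)      # plain attribute, fetched directly
```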

Parameters:
  • cls – The class to instantiate.

  • name (Optional[str], optional) – Name to give the module object, to be reused later on. (Default: None)

  • load_from_den (bool, optional) – Whether to try loading the module from Den. (Default: True)

  • dryrun (bool, optional) – Whether to create the Module if it doesn’t exist, or load a Module object as a dryrun. (Default: False)

Returns:

The resulting module.

Return type:

Module

Example - creating a module by defining an rh.Module subclass:
>>> import runhouse as rh
>>> import transformers
>>>
>>> # Sample rh.Module class
>>> class Model(rh.Module):
>>>     def __init__(self, model_id, device="cpu"):
>>>         # Note that the code here will be run in your local environment prior to being sent
>>>         # to a cluster. For loading large models/datasets that are only meant to be used remotely,
>>>         # we recommend using lazy initialization (see tokenizer and model attributes below).
>>>         super().__init__()
>>>         self.model_id = model_id
>>>         self.device = device
>>>
>>>     @property
>>>     def tokenizer(self):
>>>         # Lazily initialize the tokenizer remotely only when it is needed
>>>         if not hasattr(self, '_tokenizer'):
>>>             self._tokenizer = transformers.AutoTokenizer.from_pretrained(self.model_id)
>>>         return self._tokenizer
>>>
>>>     @property
>>>     def model(self):
>>>         if not hasattr(self, '_model'):
>>>             self._model = transformers.AutoModel.from_pretrained(self.model_id).to(self.device)
>>>         return self._model
>>>
>>>     def predict(self, x):
>>>         x = self.tokenizer(x, return_tensors="pt")
>>>         return self.model(x)
>>> # Creating rh.Module instance
>>> model = Model(model_id="bert-base-uncased", device="cuda")
>>> model = model.to(system="my_gpu")
>>> model.predict("Hello world!")  # Runs on system in process
>>> tok = model.remote.tokenizer  # Returns remote tokenizer
>>> id = model.local.model_id  # Returns local model_id, if any
>>> model_id = model.model_id  # Returns local model_id (not remote)
>>> model.fetch()  # Returns full remote module, including model and tokenizer
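The lazy-initialization pattern recommended in the comments above can be demonstrated without transformers or a cluster. A minimal sketch; `LazyModel` and its string stand-in for a heavy model load are hypothetical:

```python
# Cluster-free illustration of lazy initialization: the expensive object is
# only built on first property access, so constructing the class locally
# stays cheap and the heavy work happens only where (and if) it's needed.

class LazyModel:
    def __init__(self, model_id):
        self.model_id = model_id  # cheap: just config, no heavy loading here
        self.load_count = 0

    @property
    def model(self):
        # Build the "model" only once, on first access
        if not hasattr(self, "_model"):
            self.load_count += 1
            self._model = f"loaded:{self.model_id}"  # stand-in for a heavy load
        return self._model

m = LazyModel("bert-base-uncased")
print(m.load_count)  # 0 -- nothing loaded yet
m.model
m.model
print(m.load_count)  # 1 -- loaded exactly once
```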
Example - creating a Module from an existing class, via the rh.module() factory method:
>>> other_model = Model(model_id="bert-base-uncased", device="cuda").to("my_gpu", "my_process")
>>>
>>> # Another method: create a module instance from an existing non-Module class using rh.module()
>>> RemoteModel = rh.module(cls=BERTModel).to("my_gpu", "my_process")
>>> remote_model = RemoteModel(model_id="bert-base-uncased", device="cuda").to(system="my_gpu")
>>> remote_model.predict("Hello world!")  # Runs on system in process
>>>
>>> # You can also call remote class methods
>>> other_model = RemoteModel.get_model_size("bert-base-uncased")
>>> # Loading a module
>>> my_local_module = rh.module(name="~/my_module")
>>> my_s3_module = rh.module(name="@/my_module")

Module Class

class runhouse.Module(pointers: Tuple | None = None, signature: dict | None = None, endpoint: str | None = None, name: str | None = None, system: str | Cluster | None = None, dryrun: bool = False, **kwargs)[source]
__init__(pointers: Tuple | None = None, signature: dict | None = None, endpoint: str | None = None, name: str | None = None, system: str | Cluster | None = None, dryrun: bool = False, **kwargs)[source]

Runhouse Module object.

Note

To create a Module, please use the factory method module().

distribute(distribution: str, name: str | None = None, num_replicas: int | None = 1, replicas_per_node: int | None = None, replication_kwargs: dict | None = {}, **distribution_kwargs)[source]

Distribute the module on the cluster and return the distributed module.

Parameters:
  • distribution (str) – The distribution method to use, e.g. “pool”, “ray”, “pytorch”, or “tensorflow”.

  • name (str, optional) – The name to give to the distributed module, if applicable. Overwrites current module name by default. (Default: None)

  • num_replicas (int, optional) – The number of replicas to create. (Default: 1)

  • replicas_per_node (int, optional) – The number of replicas to create per node. (Default: None)

  • replication_kwargs – The keyword arguments to pass to the replicate method.

  • distribution_kwargs – The keyword arguments to pass to the distribution method.
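An illustrative usage sketch based on the signature above (the "pool" distribution and replica counts follow the parameters listed; `my_cluster` and `MyClass` are assumed placeholders for a running cluster and a user-defined class):

>>> remote_module = rh.module(MyClass).to(my_cluster)
>>> distributed = remote_module.distribute("pool", num_replicas=4, replicas_per_node=2)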

endpoint(external: bool = False)[source]

The endpoint of the module on the cluster. Returns an endpoint if one was manually set (e.g. if loaded down from a config). If not, request the endpoint from the Module’s system.

Parameters:

external (bool, optional) – If True and getting the endpoint from the system, only return an endpoint if it’s externally accessible (i.e. not on localhost, not connected through an SSH tunnel). If False, return the endpoint even if it’s not externally accessible. (Default: False)
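An illustrative usage sketch based on the signature above (assumes `my_module` has already been sent to a cluster):

>>> my_module.endpoint()  # Request the endpoint from the module's system
>>> my_module.endpoint(external=True)  # Only return the endpoint if it is externally accessible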

fetch(item: str | None = None, **kwargs)[source]

Helper method to allow for access to remote state, both public and private. Fetching functions is not advised. system.get(module.name).resolved_state() is roughly equivalent to module.fetch().

Example

>>> my_module.fetch("my_property")
>>> my_module.fetch("my_private_property")
>>> MyRemoteClass = rh.module(my_class).to(system)
>>> MyRemoteClass(*args).fetch()  # Returns a my_class instance, populated with the remote state
>>> my_module.fetch()  # Returns the data of the blob, due to overloaded ``resolved_state`` method
>>> class MyModule(rh.Module):
>>>     # ...
>>>
>>> MyModule(*args).to(system).fetch()  # Returns the full remote module, including private and public state
async fetch_async(key: str, remote: bool = False, stream_logs: bool = False)[source]

Async version of fetch. Can’t be a property like fetch because __getattr__ can’t be awaited.

Example

>>> await my_module.fetch_async("my_property")
>>> await my_module.fetch_async("_my_private_property")
classmethod from_config(config: Dict, dryrun: bool = False, _resolve_children: bool = True)[source]

Load or construct resource from config.

Parameters:
  • config (Dict) – Resource config.

  • dryrun (bool, optional) – Whether to construct resource or load as dryrun (Default: False)

get_or_to(system: str | Cluster, process: str | Dict | None = None, name: str | None = None)[source]

Check if the module already exists on the cluster, and if so return the module object. If not, put the module on the cluster and return the remote module.

Parameters:
  • system (str or Cluster) – The system on which to set up the module.

  • process (str or Dict, optional) – The process to run the module on. If a Dict is provided, the process will be explicitly created with those args, or with the set of requirements necessary to run the module. (Default: None)

  • name (Optional[str], optional) – Name to give to the module resource, if you wish to rename it. (Default: None)

Example

>>> remote_df = Model().get_or_to(my_cluster, name="remote_model")
property local

Helper property to allow for access to local properties, both public and private.

Example

>>> my_module.local.my_property
>>> my_module.local._my_private_property
>>> my_module.local.size = 14
method_signature(method)[source]

Method signature, consisting of method properties to preserve when sending the method over the wire.

openapi_spec(spec_name: str | None = None)[source]

Generate an OpenAPI spec for the module.

Parameters:

spec_name (str, optional) – Spec name for the OpenAPI spec.

refresh()[source]

Update the resource in the object store.

property remote

Helper property to allow for access to remote properties, both public and private. Returning functions is not advised.

Example

>>> my_module.remote.my_property
>>> my_module.remote._my_private_property
>>> my_module.remote.size = 14
rename(name: str)[source]

Rename the module.

Parameters:

name (str) – Name to rename the module to.

replicate(num_replicas: int = 1, replicas_per_node: int | None = None, names: List[str] = None, processes: List[Process] | None = None, parallel: bool = False)[source]

Replicate the module on the cluster in new processes and return the new modules.

Parameters:
  • num_replicas (int, optional) – Number of replicas of the module to create. (Default: 1)

  • names (List[str], optional) – List of names for the replicas, if specified. (Default: None)

  • processes (List[Process], optional) – List of the processes for the replicas, if specified. (Default: None)

  • parallel (bool, optional) – Whether to create the replicas in parallel. (Default: False)
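An illustrative usage sketch based on the parameters above (assumes `my_module` already lives on a cluster):

>>> replicas = my_module.replicate(num_replicas=2, parallel=True)
>>> named_replicas = my_module.replicate(num_replicas=2, names=["replica-0", "replica-1"])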

resolve()[source]

Specify that the module should resolve to a particular state when passed into a remote method. This is useful if you want to revert the module’s state to some “Runhouse-free” state once it is passed into a Runhouse-unaware function. For example, if you call a Runhouse-unaware function with .remote(), you will be returned a module which wraps your data. If you want to pass that module into another function that operates on the original data (e.g. a function that takes a numpy array), you can call my_second_fn(my_func.resolve()), and my_func will be replaced with the contents of its .data on the cluster before being passed into my_second_fn.

Resolved state is defined by the resolved_state method. By default, modules created with the rh.module factory constructor will be resolved to their original non-module-wrapped class (or best attempt). Modules which are defined as a subclass of Module will be returned as-is, as they have no other “original class.”

Example

>>> my_module = rh.module(my_class)
>>> my_remote_fn(my_module.resolve())  # my_module will be replaced with the original class `my_class`
>>> my_result_module = my_remote_fn.call.remote(args)
>>> my_other_remote_fn(my_result_module.resolve())  # my_result_module will be replaced with its data
resolved_state()[source]

Return the resolved state of the module. By default, this is the original class of the module if it was created with the module factory constructor.

save(name: str | None = None, overwrite: bool = True, folder: str | None = None)[source]

Register the resource, saving it to the Den config store. Uses the resource’s self.config() to generate the dict to save.

async set_async(key: str, value)[source]

Async version of property setter.

Example

>>> await my_module.set_async("my_property", my_value)
>>> await my_module.set_async("_my_private_property", my_value)
share(*args, visibility=None, **kwargs)[source]

Grant access to the resource for a list of users (or a single user). By default, the user will receive an email notification of access (if they have a Runhouse account) or instructions on creating an account to access the resource. If visibility is set to public, users will not be notified.

Note

You can only grant access to other users if you have write access to the resource.

Parameters:
  • users (Union[str, list], optional) – Single user or list of user emails and / or runhouse account usernames. If none are provided and visibility is set to public, resource will be made publicly available to all users. (Default: None)

  • access_level (ResourceAccess, optional) – Access level to provide for the resource. (Default: read).

  • visibility (ResourceVisibility, optional) – Type of visibility to provide for the shared resource. By default, the visibility is private. (Default: None)

  • notify_users (bool, optional) – Whether to send an email notification to users who have been given access. Note: This is relevant for resources which are not shareable. (Default: True)

  • headers (Dict, optional) – Request headers to provide for the request to Den. Contains the user’s auth token. Example: {"Authorization": f"Bearer {token}"}

Returns:

added_users:

Users who already have a Runhouse account and have been granted access to the resource.

new_users:

Users who do not have Runhouse accounts and received notifications via their emails.

valid_users:

Set of valid usernames and emails from users parameter.

Return type:

Tuple(Dict, Dict, Set)

Example

>>> # Write access to the resource for these specific users.
>>> # Visibility will be set to private (users can search for and view resource in Den dashboard)
>>> my_resource.share(users=["username1", "user2@gmail.com"], access_level='write')
>>> # Make resource public, with read access to the resource for all users
>>> my_resource.share(visibility='public')
to(system: str | Cluster, process: str | Dict | None = None, name: str | None = None, sync_local: bool = True)[source]

Send the module to a specified process on a cluster. This will sync over relevant code for the module to run on the cluster, and return a remote_module object that will wrap remote calls to the module living on the cluster.

Parameters:
  • system (str or Cluster) – The cluster on which to set up the module and process.

  • process (str or Dict, optional) – The process to run the module on. If a Dict is provided, the process will be explicitly created with those args, or with the set of requirements necessary to run the module. If no process is specified, the module will be sent to the default_process created when the cluster is created. (Default: None)

  • name (Optional[str], optional) – Name to give to the module resource, if you wish to rename it. (Default: None)

  • sync_local (bool, optional) – Whether to sync up and use the local module on the cluster. If False, don’t sync it up; use the equivalent module already found on the cluster. (Default: True)

Example

>>> local_module = rh.module(my_class)
>>> cluster_module = local_module.to("my_cluster")