A Module represents a class that can be sent to and used on remote clusters and environments. Modules can live on remote hardware, and their class methods can be called remotely.
Returns a Module object, which can be used to instantiate and interact with the class remotely.
Any callable public method of the module is intercepted and executed remotely over RPC, with the exception of certain functions Python doesn’t make interceptable (e.g. __call__, __init__), and methods of the Module class itself (e.g. to, fetch, etc.). Properties and private methods are not intercepted, and will be executed locally.
Any method which executes remotely may be called normally, e.g. model.forward(x), or asynchronously, e.g. key = model.forward.run(x) (which returns a key to retrieve the result with cluster.get(key)), or with run_obj = model.train.remote(x), which runs synchronously but returns a remote object to avoid passing heavy results back over the network.
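For illustration, a minimal sketch of the three call styles described above (model, forward, train, x, and cluster are placeholder names):
>>> result = model.forward(x)        # Normal call; runs remotely and returns the result
>>> key = model.forward.run(x)       # Async call; returns a key
>>> result = cluster.get(key)        # Retrieve the async result from the cluster
>>> run_obj = model.train.remote(x)  # Sync call; result stays on the cluster as a remote object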
Setting attributes, both public and private, will be executed remotely, with the new values only being set in the remote module and not the local one. This excludes any methods or attributes of the Module class proper (e.g. system or name), which will be set locally.
Attributes and private properties can be fetched with the remote property, and the full resource can be fetched using .fetch(), e.g. model.remote.weights, model.remote.__dict__, model.fetch().
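A rough sketch of this attribute behavior (the batch_size and weights attributes are illustrative):
>>> model.batch_size = 64    # Set on the remote module only; the local copy is unchanged
>>> model.remote.weights     # Fetch a single remote attribute
>>> model.remote.__dict__    # Fetch the remote instance's attribute dict
>>> model.fetch()            # Fetch the full remote resource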
When a module is sent to a cluster, its public attributes are serialized, sent over, and repopulated in the remote instance. This means that any subsequent changes to the local module’s attributes will not be reflected in the remote one.
cls – The class to instantiate.
name (Optional[str], optional) – Name to give the module object, to be reused later on. (Default: None)
load_from_den (bool, optional) – Whether to try loading the module from Den. (Default: True)
dryrun (bool, optional) – Whether to create the Module if it doesn’t exist, or load a Module object as a dryrun. (Default: False)
The resulting module.
>>> import runhouse as rh
>>> import transformers
>>>
>>> # Sample rh.Module class
>>> class Model(rh.Module):
>>>     def __init__(self, model_id, device="cpu"):
>>>         # Note that the code here will be run in your local environment prior to being sent
>>>         # to a cluster. For loading large models/datasets that are only meant to be used remotely,
>>>         # we recommend using lazy initialization (see tokenizer and model attributes below).
>>>         super().__init__()
>>>         self.model_id = model_id
>>>         self.device = device
>>>
>>>     @property
>>>     def tokenizer(self):
>>>         # Lazily initialize the tokenizer remotely only when it is needed
>>>         if not hasattr(self, '_tokenizer'):
>>>             self._tokenizer = transformers.AutoTokenizer.from_pretrained(self.model_id)
>>>         return self._tokenizer
>>>
>>>     @property
>>>     def model(self):
>>>         if not hasattr(self, '_model'):
>>>             self._model = transformers.AutoModel.from_pretrained(self.model_id).to(self.device)
>>>         return self._model
>>>
>>>     def predict(self, x):
>>>         x = self.tokenizer(x, return_tensors="pt")
>>>         return self.model(x)
>>> # Creating rh.Module instance
>>> model = Model(model_id="bert-base-uncased", device="cuda")
>>> model = model.to(system="my_gpu")
>>> model.predict("Hello world!")   # Runs on system in process
>>> tok = model.remote.tokenizer    # Returns remote tokenizer
>>> id = model.local.model_id       # Returns local model_id, if any
>>> model_id = model.model_id       # Returns local model_id (not remote)
>>> model.fetch()                   # Returns full remote module, including model and tokenizer
>>> other_model = Model(model_id="bert-base-uncased", device="cuda").to("my_gpu", "my_process")
>>>
>>> # Another method: Create a module instance from an existing non-Module class using rh.module()
>>> RemoteModel = rh.module(cls=BERTModel).to("my_gpu", "my_process")
>>> remote_model = RemoteModel(model_id="bert-base-uncased", device="cuda").to(system="my_gpu")
>>> remote_model.predict("Hello world!")  # Runs on system in process
>>>
>>> # You can also call remote class methods
>>> other_model = RemoteModel.get_model_size("bert-base-uncased")
>>> # Loading a module
>>> my_local_module = rh.module(name="~/my_module")
>>> my_s3_module = rh.module(name="@/my_module")
Distribute the module on the cluster and return the distributed module.
distribution (str) – The distribution method to use, e.g. “pool”, “ray”, “pytorch”, or “tensorflow”.
name (str, optional) – The name to give to the distributed module, if applicable. Overwrites current module name by default. (Default: None)
num_replicas (int, optional) – The number of replicas to create. (Default: 1)
replicas_per_node (int, optional) – The number of replicas to create per node. (Default: None)
replication_kwargs – The keyword arguments to pass to the replicate method.
distribution_kwargs – The keyword arguments to pass to the distribution method.
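A hedged usage sketch based on the parameters above (the module name and distribution choice are illustrative):
>>> # Distribute the module as a pool of 4 replicas, 2 per node
>>> pooled_model = my_module.distribute("pool", num_replicas=4, replicas_per_node=2)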
The endpoint of the module on the cluster. Returns an endpoint if one was manually set (e.g. if loaded from a config). If not, request the endpoint from the Module’s system.
external (bool, optional) – If True and getting the endpoint from the system, only return an endpoint if it’s externally accessible (i.e. not on localhost, not connected through an SSH tunnel). If False, return the endpoint even if it’s not externally accessible. (Default: False)
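For example (illustrative):
>>> # Only return a URL if the module is reachable from outside the cluster
>>> url = my_module.endpoint(external=True)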
Helper method to allow for access to remote state, both public and private. Fetching functions is not advised. system.get(module.name).resolved_state() is roughly equivalent to module.fetch().
Example
>>> my_module.fetch("my_property")
>>> my_module.fetch("my_private_property")
>>> MyRemoteClass = rh.module(my_class).to(system)
>>> MyRemoteClass(*args).fetch()  # Returns a my_class instance, populated with the remote state
>>> my_module.fetch()  # Returns the data of the blob, due to overloaded resolved_state method
>>> class MyModule(rh.Module):
>>>     # ...
>>>
>>> MyModule(*args).to(system).fetch()  # Returns the full remote module, including private and public state
Async version of fetch. Can’t be a property like fetch because __getattr__ can’t be awaited.
Example
>>> await my_module.fetch_async("my_property")
>>> await my_module.fetch_async("_my_private_property")
Load or construct resource from config.
config (Dict) – Resource config.
dryrun (bool, optional) – Whether to construct the resource or load as a dryrun. (Default: False)
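A minimal sketch, assuming from_config is called as a classmethod with a config dict previously generated by the module's config() method:
>>> config = my_module.config()
>>> reloaded = rh.Module.from_config(config, dryrun=True)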
Check if the module already exists on the cluster, and if so return the module object. If not, put the module on the cluster and return the remote module.
system (str or Cluster) – The system on which to set up the module.
process (str or Dict, optional) – The process to run the module on, or the set of requirements necessary to run the module. If it’s a Dict, the process will be explicitly created with those args. (Default: None)
name (Optional[str], optional) – Name to give to the module resource, if you wish to rename it. (Default: None)
Example
>>> remote_df = Model().get_or_to(my_cluster, name="remote_model")
Helper property to allow for access to local properties, both public and private.
Example
>>> my_module.local.my_property
>>> my_module.local._my_private_property
>>> my_module.local.size = 14
Method signature, consisting of method properties to preserve when sending the method over the wire.
Generate an OpenAPI spec for the module.
spec_name (str, optional) – Spec name for the OpenAPI spec.
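For example (the spec name is illustrative):
>>> spec = my_module.openapi_spec(spec_name="my_module_api")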
Update the resource in the object store.
Helper property to allow for access to remote properties, both public and private. Returning functions is not advised.
Example
>>> my_module.remote.my_property
>>> my_module.remote._my_private_property
>>> my_module.remote.size = 14
Rename the module.
name (str) – Name to rename the module to.
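For example:
>>> my_module.rename(name="my_renamed_module")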
Replicate the module on the cluster in new processes and return the new modules.
num_replicas (int, optional) – Number of replicas of the module to create. (Default: 1)
names (List[str], optional) – List of names for the replicas, if specified. (Default: None)
processes (List[Process], optional) – List of the processes for the replicas, if specified. (Default: None)
parallel (bool, optional) – Whether to create the replicas in parallel. (Default: False)
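A hedged sketch using the parameters above:
>>> # Create two replicas in parallel, each in a new process
>>> replicas = my_module.replicate(num_replicas=2, parallel=True)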
Specify that the module should resolve to a particular state when passed into a remote method. This is useful if you want to revert the module’s state to some “Runhouse-free” state once it is passed into a Runhouse-unaware function. For example, if you call a Runhouse-unaware function with .remote(), you will be returned a module which wraps your data. If you want to pass that module into another function that operates on the original data (e.g. a function that takes a numpy array), you can call my_second_fn(my_func.resolve()), and my_func will be replaced with the contents of its .data on the cluster before being passed into my_second_fn.
Resolved state is defined by the resolved_state method. By default, modules created with the rh.module factory constructor will be resolved to their original non-module-wrapped class (or best attempt). Modules which are defined as a subclass of Module will be returned as-is, as they have no other “original class.”
Example
>>> my_module = rh.module(my_class)
>>> my_remote_fn(my_module.resolve())  # my_module will be replaced with the original class `my_class`
>>> my_result_module = my_remote_fn.call.remote(args)
>>> my_other_remote_fn(my_result_module.resolve())  # my_result_module will be replaced with its data
Return the resolved state of the module. By default, this is the original class of the module if it was created with the module factory constructor.
Register the resource, saving it to the Den config store. Uses the resource’s self.config() to generate the dict to save.
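For example (a minimal sketch; saves the module under its current name):
>>> my_module.save()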
Async version of property setter.
Example
>>> await my_module.set_async("my_property", my_value)
>>> await my_module.set_async("_my_private_property", my_value)
Grant access to the resource for a list of users (or a single user). By default, the user will receive an email notification of access (if they have a Runhouse account) or instructions on creating an account to access the resource. If visibility is set to public, users will not be notified.
Note
You can only grant access to other users if you have write access to the resource.
users (Union[str, list], optional) – Single user or list of user emails and/or Runhouse account usernames. If none are provided and visibility is set to public, the resource will be made publicly available to all users. (Default: None)
access_level (ResourceAccess, optional) – Access level to provide for the resource. (Default: read)
visibility (ResourceVisibility, optional) – Type of visibility to provide for the shared resource. By default, the visibility is private. (Default: None)
notify_users (bool, optional) – Whether to send an email notification to users who have been given access. Note: This is relevant for resources which are not shareable. (Default: True)
headers (Dict, optional) – Request headers to provide for the request to Den. Contains the user’s auth token. Example: {"Authorization": f"Bearer {token}"}
Users who already have a Runhouse account and have been granted access to the resource.
Users who do not have Runhouse accounts and received notifications via their emails.
Set of valid usernames and emails from the users parameter.
Tuple(Dict, Dict, Set)
Example
>>> # Write access to the resource for these specific users.
>>> # Visibility will be set to private (users can search for and view resource in Den dashboard)
>>> my_resource.share(users=["username1", "user2@gmail.com"], access_level='write')
>>> # Make resource public, with read access to the resource for all users
>>> my_resource.share(visibility='public')
Put a copy of the module on the destination system and process, and return the new module.
system (str or Cluster) – The system on which to set up the module and process.
process (str or Dict, optional) – The process to run the module on, or the set of requirements necessary to run the module. If it’s a Dict, the process will be explicitly created with those args. (Default: None)
name (Optional[str], optional) – Name to give to the module resource, if you wish to rename it. (Default: None)
Example
>>> local_module = rh.module(my_class)
>>> cluster_module = local_module.to("my_cluster")