In this example, we show how to run basic hyperparameter tuning with Ray Tune on remote compute. You write your Ray Tune program as you normally would and then send it to the remote cluster; Kubetorch handles all the complexity of launching and setting up the remote Ray cluster for you.
```python
import time
from typing import Any, Dict

import kubetorch as kt
from ray import tune
```
We define a Trainable class and a find_minimum function to demonstrate a basic example of Ray Tune hyperparameter optimization. Think of this as any regular Ray Tune program you would write, entirely agnostic of Kubetorch.
```python
def train_fn(step, width, height):
    # Toy objective: a function of `width` and `height` that improves with
    # more steps. The sleep simulates a non-trivial training iteration.
    time.sleep(5)
    return (0.1 + width * step / 100) ** (-1) + height * 0.1


class Trainable(tune.Trainable):
    def setup(self, config: Dict[str, Any]):
        self.step_num = 0
        self.reset_config(config)

    def reset_config(self, new_config: Dict[str, Any]):
        # Returning True lets Tune reuse this actor with a new config
        # (see reuse_actors=True below).
        self._config = new_config
        return True

    def step(self):
        score = train_fn(self.step_num, **self._config)
        self.step_num += 1
        return {"score": score}

    def cleanup(self):
        super().cleanup()

    def load_checkpoint(self, checkpoint_dir: str):
        return None

    def save_checkpoint(self, checkpoint_dir: str):
        return None


def find_minimum(num_concurrent_trials=None, num_samples=1, metric_name="score"):
    search_space = {
        "width": tune.uniform(0, 20),
        "height": tune.uniform(-100, 100),
    }
    tuner = tune.Tuner(
        Trainable,
        tune_config=tune.TuneConfig(
            metric=metric_name,
            mode="max",
            max_concurrent_trials=num_concurrent_trials,
            num_samples=num_samples,
            reuse_actors=True,
        ),
        param_space=search_space,
    )
    tuner.fit()
    return tuner.get_results().get_best_result()
```
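One subtlety worth noting: with Tune's class-based API, a trial keeps calling step() until a stopping condition is met, so in practice you typically bound trials either with a stop condition in the Tuner's run config or by setting "done" in the result dict. The sketch below shows the latter approach; BoundedTrainable and MAX_STEPS are illustrative names, not part of the original example, and the snippet assumes it lives in the same module as the Trainable above.

```python
# Hypothetical variant: end each trial after a fixed number of iterations
# by setting "done" in the result dict returned from step().
MAX_STEPS = 10  # illustrative cap, not part of the original example


class BoundedTrainable(Trainable):
    def step(self):
        result = super().step()
        result["done"] = self.step_num >= MAX_STEPS
        return result
```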
We will now dispatch the program, set up Ray, and run the hyperparameter optimization on the remote compute.
```python
if __name__ == "__main__":
    # Launch compute with 4 CPUs and a Ray-ready image, send find_minimum
    # to it, and run it as a 2-node Ray cluster.
    head = kt.Compute(num_cpus=4, image=kt.images.ray())
    remote_find_minimum = kt.fn(find_minimum).to(head).distribute("ray", num_nodes=2)
    best_result = remote_find_minimum(num_samples=8)
```
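Assuming the Ray Tune Result object returned by find_minimum makes it back intact from the remote call, you can inspect the winning configuration and score, for example by adding the following inside the __main__ block:

```python
    # Sketch: report the best trial. `config` and `metrics` are standard
    # attributes of Ray Tune's Result object.
    print("Best hyperparameters:", best_result.config)
    print("Best score:", best_result.metrics["score"])
```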