AI Systems that Keep Learning

Mobilize your first-party data and iteratively improve your AI applications through continuous refinement and fine-tuning of your models. Runhouse is a scalable, cost-effective, and fault-tolerant platform housed inside your own cloud.

AI Applications Depend on Good AI Models

AI systems rely on one or more ML models working in concert to generate a desired result. With Runhouse, you can tune these models for your production apps using first-party data and user interaction traces.

  • For RAG, fine-tune your embedding models to become domain-aware (see the sketch after this list).
  • For chat, use reinforcement learning from human feedback to guide outputs.
  • For agentic actions, tune extremely fast and accurate small language models.
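
To make the first item concrete, here is a minimal sketch of domain fine-tuning for an embedding model. It is not part of Runhouse's API; it assumes the sentence-transformers library and a small set of in-domain (query, passage) pairs, and in practice would be wrapped in a class and sent to a GPU cluster with rh.module as shown later on this page.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Start from a general-purpose embedding model
model = SentenceTransformer("all-MiniLM-L6-v2")

# In-domain (query, passage) pairs mined from your own data and traces;
# the examples here are purely illustrative
train_examples = [
    InputExample(texts=["how do I rotate an API key?",
                        "To rotate a key, open the admin console and ..."]),
    InputExample(texts=["refund policy for annual plans",
                        "Annual plans can be refunded within 30 days ..."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Contrastive objective that pulls matching pairs together in embedding space
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1, warmup_steps=10)
model.save("domain-tuned-embedder")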

Runhouse makes it extremely easy for any data scientist or engineer to deploy training and inference code written in normal Python onto your cloud compute.

Flexible and Simple for Your Engineers

Runhouse is the easiest way to move faster, with no additional infrastructure overhead or lift from your platform team.

  • Scale from single GPU to distributed training with a single line of code.
  • Manage compute and execution with full visibility and control, including direct SSH access into the distributed clusters.
  • Full reproducibility and fault tolerance for automation.

And our team of ML experts can help guide you to the best research methods and approaches.

Runhouse GitHub


AI Is About Winning with Data

Training over first-party data is nothing new. Banks customize their fraud models rather than trusting external models; TikTok retrains its recommender models every 15 minutes; OpenAI itself productized GPT-3 using reinforcement learning from human feedback. Production AI systems require the same iterative feedback loops to steadily improve accuracy and quality over time.


Runhouse is a flexible framework for model training and fine-tuning.

Scale Up Easily, Work at Any Scale

Runhouse is the easiest way to start and robustly execute distributed training jobs. Scale up with just one line of code:

import runhouse as rh

# Write a regular Python class, like a
# Llama model trainer, and send it to your cluster
from llama70b_tune import Trainer

remote_trainer_class = rh.module(Trainer).to(cluster)

# Instantiate a remote instance and call
# .distribute() to set up distributed training
remote_trainer = remote_trainer_class().distribute(
    distribution="pytorch",
    replicas_per_node=gpus_per_node,
    num_replicas=gpus_per_node * num_nodes,
)
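
Once deployed, the remote trainer behaves like a local Python object; a minimal usage sketch, assuming the Trainer class in llama70b_tune exposes a train() method (the method name and arguments here are illustrative, not taken from the snippet above):

# Hypothetical call into the remote, distributed trainer; substitute
# your Trainer's actual method names and hyperparameters
remote_trainer.train(
    dataset_path="s3://your-bucket/finetune-data",  # illustrative path
    epochs=3,
)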

Launched within Your Cloud

Runhouse works with your own cloud compute and makes it simple to use. Developers should focus on methods, not on wrangling complex infrastructure just to launch training.

# Launch from elastic compute
aws_secret = rh.provider_secret("aws")
lambda_secret = rh.provider_secret("lambda", values={"api_key": "lambda_key"})

# Existing Kubernetes clusters
kube_config = rh.provider_secret(provider="kubernetes", path="~/.kube/config")

# Or VMs
ssh_secret = rh.provider_secret(provider="ssh", name="on_prem_compute")

# Launch a multinode cluster defined in code
gpu_cluster = rh.cluster(
    name=f"rh-{num_nodes}x{gpus_per_node}-gpu",
    gpus=f"A100:{gpus_per_node}",
    num_nodes=num_nodes,
    use_spot=True,  # Can use spot instances easily
    provider=provider,
    image=img,
).up_if_not()

Everything you need to get started with Runhouse today.

See an Example

Learn more about the technical details of Runhouse and try integrating the open-source package into your existing Python code. Here's an example of how to deploy Llama 3 to EC2 in just a few lines.
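
For orientation before you open the linked example, here is a rough sketch of what that deployment can look like. It assumes the Hugging Face transformers pipeline API, access to the meta-llama/Meta-Llama-3-8B-Instruct weights, and an A10G instance on AWS; treat the linked example as the canonical version.

import runhouse as rh

class LlamaModel:
    def __init__(self, model_id="meta-llama/Meta-Llama-3-8B-Instruct"):
        self.model_id = model_id
        self.pipeline = None

    def generate(self, prompt, **kwargs):
        # Load the model lazily on the remote GPU the first time generate() is called
        if self.pipeline is None:
            import torch
            from transformers import pipeline
            self.pipeline = pipeline(
                "text-generation",
                model=self.model_id,
                torch_dtype=torch.bfloat16,
                device_map="auto",
            )
        return self.pipeline(prompt, **kwargs)

# Launch (or reuse) a single-GPU EC2 instance and send the class to it
gpu = rh.cluster(name="rh-llama3-a10g", gpus="A10G:1", provider="aws").up_if_not()
RemoteLlama = rh.module(LlamaModel).to(gpu)
llm = RemoteLlama()
print(llm.generate("What is Runhouse?", max_new_tokens=100))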

See an Example

Talk to Donny (our founder)

We've been building ML platforms and open-source libraries like PyTorch for over a decade. We'd love to chat and get your feedback!

Book Time

Get in touch 👋

Whether you'd like to learn more about Runhouse or need a little assistance trying out the product, we're here to help.