StableDiffusion XL: How to Host Your Own Image Generation AI
A complete guide to hosting your own image generation AI with StableDiffusion. This post walks through setting up StableDiffusion XL on AWS, covering the hardware requirements, software dependencies, and configuration needed to launch your own image AI.
Paul Yang
ML @ 🏃♀️Runhouse🏠
July 29, 2024
Setup
First, we need to set up our local environment to make sure we can deploy the model to AWS. You will need:
- AWS credentials with permission to launch a cluster (You can also use GCP or any other cloud of your choice)
- A Hugging Face token, which lets you download the model
```shell
$ pip install "runhouse[aws]" Pillow
$ aws configure
$ sky check
$ export HF_TOKEN=<your huggingface token>
```
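Before launching anything, it can help to confirm the token is actually visible to your environment. A minimal sketch (the `hf_token_available` helper is just illustrative, not part of Runhouse or Hugging Face):

```python
import os


def hf_token_available(env=os.environ) -> bool:
    # True if a Hugging Face token is present in the given environment mapping.
    return bool(env.get("HF_TOKEN"))


if __name__ == "__main__":
    print("HF_TOKEN set:", hf_token_available())
```

If this prints `False`, re-run the `export HF_TOKEN=...` step in the same shell before continuing.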
Code
The following code is short and simple, and deploys StableDiffusion XL to an AWS cloud machine.
- The g5.8xlarge we specify here costs ~$2.40/hour on demand, which makes it reasonable for research purposes.
- We first define a model class, which will download the model, load it into memory, and run inference.
- Then we send that model class to remote compute using `.get_or_to()`, which makes the code callable locally while it runs on the remote machine. The first time this script runs, it will take a few minutes because the model needs to download first.
- Then we can call `.generate()` and display the image locally, even though execution happened on the remote cluster.
```python
import base64
import os
from io import BytesIO

import runhouse as rh
from PIL import Image


# Define a class that will hold the model and allow us to send prompts to it.
class StableDiffusionXLPipeline(rh.Module):
    def __init__(
        self,
        model_id: str = "stabilityai/stable-diffusion-xl-base-1.0",
        model_dir: str = "sdxl",
    ):
        super().__init__()
        self.model_dir = model_dir
        self.model_id = model_id
        self.pipeline = None

    def _model_loaded_on_disk(self):
        return (
            self.model_dir
            and os.path.isdir(self.model_dir)
            and len(os.listdir(self.model_dir)) > 0
        )

    def _load_pipeline(self):
        import torch
        from diffusers import DiffusionPipeline
        from huggingface_hub import snapshot_download

        if not self._model_loaded_on_disk():
            # Download the model from the Hugging Face Hub to a local directory,
            # excluding symlinks and "hidden" files like .DS_Store, .gitignore, etc.
            snapshot_download(
                self.model_id,
                local_dir=self.model_dir,
                local_dir_use_symlinks=False,
                allow_patterns=["[!.]*.*"],
            )

        # Load the local model into the pipeline and move it to the GPU.
        self.pipeline = DiffusionPipeline.from_pretrained(
            self.model_dir, torch_dtype=torch.float16
        )
        self.pipeline.to("cuda")

    def generate(self, input_prompt: str, output_format: str = "JPEG", **parameters):
        # Lazily load the pipeline on the first call.
        if not self.pipeline:
            self._load_pipeline()

        generated_images = self.pipeline(input_prompt, **parameters)["images"]

        # Postprocess: convert each image into a base64 string so it can be
        # sent back to the local machine.
        encoded_images = []
        for image in generated_images:
            buffered = BytesIO()
            image.save(buffered, format=output_format)
            encoded_images.append(base64.b64encode(buffered.getvalue()).decode())
        return encoded_images


def decode_base64_image(image_string):
    base64_image = base64.b64decode(image_string)
    buffer = BytesIO(base64_image)
    return Image.open(buffer)
```

Now, we define the main function that will run locally when we run this script, and set up our Runhouse module on a remote cluster.
First, we create a cluster with the desired instance type and provider. Our `instance_type` here is `g5.8xlarge`, an AWS instance type costing $2.40/hr on demand as of 7/29/2024.

```python
if __name__ == "__main__":
    cluster = rh.cluster(
        name="rh-g5",
        instance_type="g5.8xlarge",
        provider="aws",
    ).up_if_not()
```

Next, we define the environment for our module. This includes the required dependencies that need to be installed on the remote machine, as well as any secrets that need to be synced up from local to remote. Passing `huggingface` to the `secrets` parameter will load the Hugging Face token we set up earlier.

```python
    env = rh.env(
        name="sdxl_inference",
        reqs=[
            "diffusers==0.21.4",
            "huggingface_hub",
            "torch",
            "transformers==4.31.0",
            "accelerate==0.21.0",
        ],
        secrets=["huggingface"],  # Needed to download the model
    )
```

Finally, we define our module and run it on the remote cluster. We construct it normally and then call `get_or_to` to run it on the remote cluster. Using `get_or_to` allows us to load the existing Module by the name `sdxl` if it was already put on the cluster. If we want to update the module each time we run this script, we can use `to` instead of `get_or_to`.

```python
    model = StableDiffusionXLPipeline().get_or_to(cluster, env=env, name="sdxl")
```

We can call the `generate` method on the model class instance as if it were running locally.

```python
    prompt = "A woman runs through a large, grassy field towards a house."
    response = model.generate(
        prompt,
        num_inference_steps=25,
        negative_prompt="disfigured, ugly, deformed",
    )

    for gen_img in response:
        img = decode_base64_image(gen_img)
        img.show()
```
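If you would rather write the generated images to disk than open a viewer window, the base64 strings returned by `generate` can be decoded with the standard library alone. A small sketch (the `save_base64_images` helper is hypothetical, not part of the script above):

```python
import base64
from pathlib import Path


def save_base64_images(encoded_images, out_dir="outputs", fmt="jpeg"):
    # Decode base64-encoded image strings (as returned by `generate`)
    # and write each one to a file in `out_dir`. Returns the written paths.
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, encoded in enumerate(encoded_images):
        path = out / f"image_{i}.{fmt}"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

This avoids the Pillow dependency on the receiving end entirely, since the bytes are already a complete JPEG once decoded.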