How to implement Stable Diffusion webUI on E2E Cloud?
Stable Diffusion is a milestone in generative models: it has reached a wide audience thanks to the quality of the images it produces, its speed, and its relatively low compute and memory requirements. In this post we give an overview of Stable Diffusion and walk through the steps required to set up the webUI on E2E Cloud.
Scope of the Content:
The two major ways to use Stable Diffusion are:
Overview of key components:-
Let's consider the text2img case and walk through the components and what each one does. The input text passes through a Text Encoder (CLIPText), which produces token embeddings in latent space that represent the features of the text.
These token embeddings and random noise are passed to the Image Information Creator (based on UNet + Scheduler). This is the component where the diffusion process takes place. It produces a processed image tensor in latent space, which is fed to the Image Decoder (based on an Autoencoder Decoder) to produce a high-resolution image.
A comprehensive overview from the original research paper “High-Resolution Image Synthesis with Latent Diffusion Models”
1. ClipText for text encoding.
   Input: text.
   Output: 77 token embedding vectors, each with 768 dimensions.
2. UNet + Scheduler to gradually process/diffuse information in the information (latent) space.
   Input: text embeddings and noise.
   Output: a processed information array.
3. Autoencoder Decoder that paints the final image using the processed information array.
   Input: the processed information array (dimensions: (4, 64, 64)).
   Output: the resulting image (dimensions: (3, 512, 512)).
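To make the shapes above concrete, here is a minimal Python sketch that only does the bookkeeping; it does not run any model. The function names are hypothetical, and the scale factor of 8 is an assumption inferred from the numbers quoted above (a 64 x 64 latent decoding to a 512 x 512 image).

```python
# Shape bookkeeping for the three components described above.
# Nothing here loads or runs a model; it only tracks array dimensions.

TOKEN_COUNT = 77        # token embedding vectors produced by the text encoder
EMBED_DIM = 768         # dimensions of each embedding vector
LATENT_CHANNELS = 4     # channels of the processed information array
IMAGE_CHANNELS = 3      # RGB channels of the final image
VAE_SCALE = 8           # assumption: 64 -> 512 implies a factor of 8 per side


def text_embedding_shape():
    """Shape of the text encoder output."""
    return (TOKEN_COUNT, EMBED_DIM)


def latent_shape(height=512, width=512):
    """Shape of the latent array for a target image size."""
    if height % VAE_SCALE or width % VAE_SCALE:
        raise ValueError("height and width must be multiples of 8")
    return (LATENT_CHANNELS, height // VAE_SCALE, width // VAE_SCALE)


def decoded_image_shape(latent):
    """Shape of the image the decoder produces from a latent array."""
    _, h, w = latent
    return (IMAGE_CHANNELS, h * VAE_SCALE, w * VAE_SCALE)
```

Calling latent_shape(512, 512) gives the (4, 64, 64) array listed above, and decoded_image_shape on that result gives (3, 512, 512).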
Launching a GPU on E2E Cloud:-
Congratulations! You have successfully created a GPU node on E2E Cloud. The public and private IP addresses, along with the credentials, will be sent to your email. If you need help or have any questions, please visit https://docs.e2enetworks.com/
Installation and Running of Stable Diffusion:-
Access the node you created over SSH:-
ssh root@your_public_ip
This will prompt for a password (if you haven't disabled password login). Type your password and press Enter.
Required Dependencies:-
1. Python 3.10.6 and Git.
2. The code from this repository, obtained using git: git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
3. The Stable Diffusion model checkpoint, a file with the .ckpt extension, which needs to be downloaded and placed in the models/Stable-diffusion directory.
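Before launching, it can help to confirm that these three prerequisites are in place. The following is a small sketch using only the Python standard library; the helper names are hypothetical, and the check treats any Python 3.10.x as acceptable, which is an assumption (the list above names 3.10.6 specifically).

```python
import shutil
import sys
from pathlib import Path


def python_version_ok(version_info=sys.version_info, required=(3, 10)):
    """True when the interpreter's major.minor equals the required pair."""
    return tuple(version_info[:2]) == tuple(required)


def git_available():
    """True when a git executable is found on PATH."""
    return shutil.which("git") is not None


def find_checkpoints(webui_dir):
    """List .ckpt files in <webui_dir>/models/Stable-diffusion."""
    model_dir = Path(webui_dir) / "models" / "Stable-diffusion"
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.glob("*.ckpt"))


# Example usage (the directory name assumes the default clone location):
# print(python_version_ok(), git_available())
# print(find_checkpoints("stable-diffusion-webui"))
```

An empty list from find_checkpoints means the checkpoint has not yet been placed in the expected directory.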
Use :-
Installation on Windows:-
Run webui-user.bat from Windows Explorer as a normal, non-administrator user.
Installation on Linux:-
To install in /home/$(whoami)/stable-diffusion-webui/, run:
Use:-
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)
This installs and launches the Stable Diffusion webUI, which runs on the node at http://127.0.0.1:7860
Create an SSH tunnel to access the webUI from your local machine:-
ssh -L 7860:localhost:7860 username@your_public_ip
Open a browser on your local machine and visit http://127.0.0.1:7860
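If the page does not load, a quick way to tell whether the tunnel is up is to test whether anything is accepting connections on the forwarded local port. This is a minimal sketch using Python's standard socket module; the function name is hypothetical, and the host and port defaults are taken from the tunnel command above.

```python
import socket


def port_is_open(host="127.0.0.1", port=7860, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Example usage on the local machine while the SSH tunnel is running:
# print(port_is_open())
```

A result of False usually means the tunnel is not running or the webUI has not finished starting on the node.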
Generating a sample image from a text prompt:-
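Besides typing a prompt into the browser, an image can also be requested from a script. The sketch below assumes the webUI was started with its --api option (not part of the steps in this post) and that the SSH tunnel to port 7860 is open; it posts to the webUI's /sdapi/v1/txt2img endpoint. The function names, the output file name, and the parameter values are illustrative choices, not recommendations.

```python
import base64
import json
import urllib.request

# Assumption: webUI started with --api and reachable through the tunnel.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, steps=20, width=512, height=512, seed=-1):
    """Build the JSON body for a txt2img request (seed -1 means random)."""
    return {
        "prompt": prompt,
        "steps": steps,
        "width": width,
        "height": height,
        "seed": seed,
    }


def txt2img(prompt, out_path="sample.png", url=API_URL):
    """Request one image for the prompt and save it as a PNG file."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        result = json.load(response)
    # Images come back base64-encoded; drop a data-URI prefix if present.
    image_b64 = result["images"][0].split(",", 1)[-1]
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(image_b64))
    return out_path


# Example usage:
# txt2img("a lighthouse on a cliff at sunset, oil painting")
```

The 512 x 512 default matches the output size described in the overview; larger sizes need more GPU memory.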
Bonus Tip:-
Please visit https://lexica.art/ and search for an image that interests you. From the search results, pick the one that best fits your requirement and click on it. It will show the prompt and the parameters used to create that image.