Cars are neat. We spend a lot of time in them, looking at them, paying for them, making infrastructure for them. So why not dedicate some GPU power to the beautiful death machines that roam the streets every day?
I had one thing in mind for this animation: a perfect loop going all the way around the car. Easy enough, right? All I had to do was get a 360-degree video, run img2img on each frame, and stitch the frames back together. It ended up being a little more involved than that, but overall that was the concept.
To get a 360-degree video around the car, I needed to either photograph the car myself with a camera rig or drone, or find a reliable source of consistent images. Luckily, Carvana already does 360-degree imaging for their car listings. It’s accessible on most listings on their website, and it imitates motion by placing the vehicle on a motorized spinning platform and taking photos at specified intervals. Each frame is displayed in their web app, and you can simply open the network tab in your browser’s developer tools (inspect element) to find all of the image frames that make up the full animation.
Downloading them all by hand was a little tedious, especially since there are 63 images for each rotation, so I made a Python script to automate it. This is possible because the image links follow an incremental pattern: only the zero-padded frame number changes from one URL to the next. If you want to use this yourself, you’ll need to replace the strings for the beginning and end of the image URLs in the script with the ones from your listing. I also had it generate a GIF from the frames directly.
Find the script here: https://pastebin.com/ZdQfdmRP
import requests
from PIL import Image
from io import BytesIO

# Updated base URL and suffix for the new set of images
base_url = "https://vexgateway.fastly.carvana.io/executions/105356595/FLOOR_CLEANER/cleaned/clean_"
url_suffix = ".jpg?v=1705586461.054&crop=60.79p,51.29p,x18.27p,y29.79p&quality=75&optimize=medium&width=2000"

images = []

# Download images and provide console output
for i in range(1, 64):  # There are 63 images
    url = f"{base_url}{str(i).zfill(3)}{url_suffix}"
    print(f"Downloading image {i}...")
    response = requests.get(url)
    if response.status_code == 200:
        images.append(Image.open(BytesIO(response.content)))
        print(f"Image {i} downloaded successfully.")
    else:
        print(f"Failed to download image {i}")

# Create GIF and provide console output
if images:
    gif_filename = "output.gif"
    print("Creating GIF...")
    images[0].save(
        gif_filename,
        save_all=True,
        append_images=images[1:],
        duration=100,  # duration between frames in milliseconds
        loop=0  # loop forever
    )
    print(f"GIF created successfully: {gif_filename}")
else:
    print("No images to create GIF.")
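The script above only writes a GIF. If your workflow loads its reference frames from a folder instead, something like the following could be added. This is only a sketch: the `save_frames` helper, the `frames` folder name, and the `frame_001.png` naming scheme are my own placeholders, not part of the original script.

```python
from pathlib import Path

from PIL import Image


def save_frames(images, out_dir="frames"):
    """Write each frame as a zero-padded PNG and return the saved paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, img in enumerate(images, start=1):
        # Convert to RGB so every saved frame has the same mode.
        path = out / f"frame_{i:03d}.png"
        img.convert("RGB").save(path)
        paths.append(path)
    return paths
```

Calling `save_frames(images)` after the download loop would leave 63 numbered PNGs in `frames/`, sorted in rotation order.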
For the image generation, I used the Dreamshaper 8 checkpoint (based on Stable Diffusion 1.5). The reason I chose SD1.5 over SDXL Turbo is simply the lack of development and flexibility in the SDXL version of AnimateDiff.
As with my other controlled animations, ComfyUI is the software of choice, or perhaps it’s the software of necessity, since Automatic1111 does not seem to work with ControlNet and AnimateDiff together. The workflow is based on Latent Vision’s basic prompt travel template, which is available here: https://openart.ai/workflows/matt3o/template-for-prompt-travel-openpose-controlnet/kYKv5sJWchSsujm0zOV0
I have adapted the workflow to suit my needs a bit better, but the core of it is the same, with the v3 motion module and adapter LoRA. I also used a lineart ControlNet and (optionally) an IP-Adapter for style transfer.
Since we are still using an empty latent image and the only control over the car comes from the prompt and the lineart ControlNet, the color of the car in the reference is irrelevant. We can simply change the color of the car in the positive prompt.
One issue that arises is that when the make/model is specified in the prompt, the model may try to generate only front-facing images and ignore the frames where the vehicle is supposed to be facing away from the camera. To remedy this, I described the rough position of the car at each stage of the rotation in the batch prompt scheduler. Since all the animations are 63 frames, the same schedule should work for any car with a 63-frame rotation.
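To give an idea of what such a schedule could look like, here is a minimal sketch that generates one. The view descriptions, the starting orientation, the spin direction, and the `"frame": "prompt"` keyframe format are all assumptions on my part, so adjust them to match your own frames and whatever your scheduler node expects.

```python
# Hypothetical view descriptions. The starting orientation and spin
# direction are assumptions and must match your own source frames.
VIEWS = [
    "front view",
    "front three-quarter view",
    "side view",
    "rear three-quarter view",
    "rear view",
    "rear three-quarter view",
    "side view",
    "front three-quarter view",
]


def build_schedule(subject, total_frames=63, views=VIEWS):
    """Return keyframed prompt text with one keyframe per view segment."""
    lines = []
    for k, view in enumerate(views):
        # Spread the keyframes evenly over the rotation (integer frames).
        frame = k * total_frames // len(views)
        lines.append(f'"{frame}": "{subject}, {view}"')
    return ",\n".join(lines)


schedule = build_schedule("red sports car")
print(schedule)
```

With 63 frames and eight views, the keyframes land on frames 0, 7, 15, 23, 31, 39, 47, and 55. The printed text can be pasted into the scheduler, with the subject swapped for whatever make, model, and color you are after.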
Although this one may not be as clear, it’s probably my favorite. It’s a rendition of my brother’s Model 3 reimagined on Mario Kart’s Rainbow Road, using IP-Adapter for style transfer.