
How to Create an AI-Generated Music Video with GPT-4o and Hedra

Learn to generate an AI music video by creating a base image with GPT-4o, animating it with Hedra, preparing the audio with Adobe Audition and Demucs, and assembling the final video in Kapwing.

From How I AI

How I AI: Anish Acharya's 3 Creative AI Workflows for Music Videos, Book Cataloging, and Personal Finance

with Claire Vo

Tools Used

GPT-4o

OpenAI's multimodal model

Step-by-Step Guide

1

Generate the Main Image with GPT-4o

Use an image generation model like GPT-4o to create the central visual for your video. Tweak the prompt until you achieve the desired aesthetic. For example, the author started with a prompt for a musician with a guitar but later removed the guitar for an a cappella feel.

Prompt:
generate an image of Kurt Cobain playing a Tiny Desk concert
Pro Tip: Experiment with different descriptions in your prompt to fine-tune the style, lighting, and composition of your image.
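
One way to experiment systematically is to generate a batch of prompt variants and try them one at a time. The sketch below is only an illustration: the base prompt comes from this guide, while the style and lighting phrases and the `prompt_variants` helper are hypothetical.

```python
from itertools import product

# Base prompt taken from the guide.
BASE_PROMPT = "generate an image of Kurt Cobain playing a Tiny Desk concert"


def prompt_variants(base, styles, lighting):
    """Combine a base prompt with every style/lighting pair."""
    return [f"{base}, {style}, {light}" for style, light in product(styles, lighting)]


# Hypothetical modifiers; swap in whatever aesthetic you are after.
variants = prompt_variants(
    BASE_PROMPT,
    styles=["35mm film photo", "grainy 90s camcorder still"],
    lighting=["warm office lighting", "soft window light"],
)

for text in variants:
    print(text)
```

Paste each printed variant into GPT-4o and keep the one whose output best matches the look you want.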
2

Animate the Image with Hedra

Take the static image generated in the previous step and upload it to Hedra. Hedra animates the image and lets you add a custom audio track, syncing the subject's lip movements to that audio.

3

Source and Prepare Audio

Obtain the audio track for your video. You can use a tool like 4K Video Downloader to extract audio from an existing video (e.g., a YouTube concert). Then, use a program like Adobe Audition to edit and isolate the specific audio segment you need.
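
If you prefer a scriptable alternative to Adobe Audition for the trimming part, the segment can also be cut with ffmpeg. This is a swapped-in technique, not the workflow shown in the episode. The sketch assumes ffmpeg is installed and on your PATH; the file names and timestamps are hypothetical.

```python
import subprocess
from pathlib import Path


def build_trim_command(source: Path, start_s: float, end_s: float, dest: Path) -> list[str]:
    """Build an ffmpeg command that keeps only start_s..end_s of the source audio."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    duration = end_s - start_s
    return [
        "ffmpeg",
        "-y",                       # overwrite dest if it already exists
        "-i", str(source),          # input file (audio or video)
        "-ss", f"{start_s:.3f}",    # start offset in seconds
        "-t", f"{duration:.3f}",    # length of the clip in seconds
        str(dest),                  # output format is inferred from the extension
    ]


def trim_audio(source: Path, start_s: float, end_s: float, dest: Path) -> None:
    """Run the trim; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_trim_command(source, start_s, end_s, dest), check=True)
```

For example, `trim_audio(Path("concert.mp4"), 62.5, 95.0, Path("segment.wav"))` would write a 32.5-second WAV clip that you can then upload to Hedra or pass to the next step.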

4

Isolate Vocals with Demucs (Optional)

If you want to create an a cappella version of your song, use a tool like Demucs to separate the vocals from the instrumental backing. This can create a more intimate and focused feel for the video.

Command:
demucs --two-stems=vocals /path/to/audio.mp3
Pro Tip: This step requires using the command line. Ensure you have Demucs installed and provide the correct file path to your audio file.
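
The same separation can be driven from a script, which also makes it easy to find the output files afterwards. This is a minimal sketch that assumes Demucs v4 is installed, passes the `htdemucs` model name explicitly, and relies on Demucs's default output layout of `<out>/<model>/<track>/<stem>.wav`. The helper names and the `segment.wav` file are hypothetical.

```python
import subprocess
from pathlib import Path

MODEL = "htdemucs"  # assumed model name; passed explicitly so the output folder is predictable


def build_demucs_command(audio: Path, out_dir: Path = Path("separated")) -> list[str]:
    """Build the Demucs command that splits a track into vocals and everything else."""
    return [
        "demucs",
        "--two-stems=vocals",   # produce vocals.wav and no_vocals.wav only
        "-n", MODEL,            # which pretrained model to use
        "-o", str(out_dir),     # root folder for separated stems
        str(audio),
    ]


def expected_stem_paths(audio: Path, out_dir: Path = Path("separated")) -> tuple[Path, Path]:
    """Return where Demucs is expected to write the vocal and backing stems."""
    track_dir = out_dir / MODEL / audio.stem
    return track_dir / "vocals.wav", track_dir / "no_vocals.wav"


def isolate_vocals(audio: Path, out_dir: Path = Path("separated")) -> Path:
    """Run Demucs and return the path of the isolated vocal track."""
    subprocess.run(build_demucs_command(audio, out_dir), check=True)
    vocals, _ = expected_stem_paths(audio, out_dir)
    return vocals
```

Calling `isolate_vocals(Path("segment.wav"))` should leave the a cappella track at `separated/htdemucs/segment/vocals.wav`, ready to use as the audio in Hedra.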
5

Assemble the Final Video in Kapwing

Import your animated clip from Hedra into a video editor like Kapwing. You can also generate additional B-roll clips using a video generation model like Veo 3 to add variety and match the overall aesthetic (e.g., '90s grunge'). Combine all clips and the final audio track to produce your music video.
