Text-to-Video Showdown: Grok vs Veo 3.1 vs Kling vs Midjourney

John Polacek
November 15, 2025 (updated November 16, 2025)

I'm starting a new AI video project for "The Way to Dusty Death", my hypothetical fourth act to the recent Netflix thriller “House of Dynamite”, about nuclear war.

To get started, I thought I would try the same prompt with different AI Text to Video platforms. It was interesting to see the results!

Ok, so Spoiler Alert! If you want to watch the movie, go do that now. I'll try not to give away too much, but if you want to not have anything spoiled, stop reading now.

House of Dynamite is a hyperrealistic depiction of a nightmare scenario featuring a nuclear missile flying toward the United States, specifically Chicago - a place I like to call home!

The first prompt I'm using might be a bit disturbing, but we are picking up where the movie left off. The movie never specifically says there was a detonation, but the strong implication is that one was likely, and that is where my AI film is going to begin.

The first prompt I'll be providing to these AI video generators is simply this: A nuclear missile strikes the city of Chicago. A blinding white explosion.

Now, let’s review from worst to first (in my opinion)...

Google Veo 3.1

I’ve been using Google Veo 3.1 extensively in my other project. It does very well with Frames to Video, but Text to Video was a miss. The missile was super cheesy and the explosion was underwhelming.

Kling 2.5

Kling 2.5 did not do much better. At least there was no cheesy missile, but the explosion was weak!

Midjourney

Midjourney does not seem to let you go straight from text to video. First you create an image, then from there you can create the video. So I generated four images, then picked the one I thought was the best.

The explosion was the biggest so far, but somehow these buildings all stay intact.

Grok Imagine

This was a close call. Personally, I found the video that Grok generated was the best overall.

The first video it generated was so-so, but it also generated a bunch of images to generate a new video from. I picked one and thought overall it was the biggest and most effective for the story I'm telling. The Midjourney video was nice, but I found Grok's to be a little more dramatic, mostly due to showing the actual destruction of the buildings.

When you look a little closer, what's the deal with the boats on Lake Michigan? They all just simultaneously went WTF?!...GTFO!

Midjourney + Google Veo 3.1

For the best result, I ultimately did a combo with an image from Midjourney turned into video with Veo 3.1 and some prompt tweaking.

I still couldn't get the buildings to break up. I love the Sears Tower, but it would not survive a nuclear blast. Ultimately, to pull off this first scene, I'll have to trim the generation, but we have about four seconds of decent footage.