Prompts 101
25 Apr 2024
Intro
The first step in any video generation is building your prompt. It's both the easiest part of the process and the hardest, so understanding what goes into a good prompt, as well as the common pitfalls to avoid, is vitally important.
A text-to-video generation is broken down into two parts: the first frame and the animation. Your prompt first guides the AI model to generate an image, such as "a horse walks on the beach". The model then uses the prompt a second time, together with the newly generated image, to guide the video generation. This means action words like "jumping", "running", and "exploding" are particularly good at creating high-quality results, since the image they generate and the motion they request match quite nicely.
Not all prompts can be that simple, though. To get unique, high-quality generations we'll have to be a lot more specific, and the longer our prompts get, the more important the order of our keywords becomes. So let's break down what makes a great prompt, and how to make the most of Haiper text-to-video.
Start Simple
You typically want to start with a broad instruction on what you wish the video model to generate: the shortest and most direct description of the image, an action taken, or a motion. The prompt doesn't need any pleases or thank yous, nor fiddly syntax like "--IW" or hyperlinks. There is no great benefit to typing out a whole ChatGPT-style request with links and questions. From the examples below, we got a far better result with some interesting words than with the hyperlink, where the model just picked up on the words "dog" and "Husky".
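If you tend to paste in chat-style requests, a small clean-up step can help. Here is a minimal sketch in Python, assuming you prepare prompts in a script before submitting them. The `clean_prompt` helper and its patterns are our own illustration, not part of Haiper, and the list of things it strips is deliberately short.

```python
import re

# Illustrative patterns only: extend or trim these to suit your own prompts.
URL = re.compile(r"https?://\S+")          # hyperlinks
FLAG = re.compile(r"--\w+")                # fiddly syntax such as "--IW"
POLITE = re.compile(r"\b(please|thank you|thanks)\b", re.IGNORECASE)


def clean_prompt(text):
    """Strip links, flag-style syntax, and pleasantries from a prompt."""
    for pattern in (URL, FLAG, POLITE):
        text = pattern.sub(" ", text)
    text = re.sub(r"\s+", " ", text)            # collapse leftover whitespace
    text = re.sub(r"\s+([,.!?])", r"\1", text)  # no space before punctuation
    return text.strip(" ,.!?")


raw = "Please make a husky running in the snow --IW https://example.com/dog.jpg thank you!"
print(clean_prompt(raw))
```

What remains is the short, direct description the model can actually use.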
The first part of your prompt is going to have the greatest effect on your result, so lock this in first. Start simple and play around with some strong action keywords before moving on to style, details, and camera movements.
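One way to keep that ordering consistent is to assemble prompts from named parts. The sketch below is an assumption on our side rather than an official tool: `build_prompt` and its parameters are hypothetical names, "stormy sky" is just a placeholder detail, and the comma-separated layout is one reasonable choice among many.

```python
def build_prompt(core, style="", details=(), camera=""):
    """Join prompt parts in priority order: core action first, extras after."""
    core = core.strip()
    if not core:
        raise ValueError("start with a short core description")
    parts = [core]
    if style:
        parts.append(f"in the style of {style.strip()}")
    # Skip empty detail strings so stray commas don't creep in.
    parts.extend(d.strip() for d in details if d.strip())
    if camera:
        parts.append(camera.strip())
    return ", ".join(parts)


prompt = build_prompt(
    "a horse runs on the beach",
    style="a 1980s grindhouse horror",
    details=["stormy sky"],
    camera="low angle",
)
print(prompt)
```

Because the core description always comes first, you can lock it in and then swap styles, details, and camera keywords without disturbing it.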
Give References
This is where we follow up with the "in the style of" part of our prompt. Think of the art styles, locations and settings, and popular movie genres you'd like to see, and reference them directly. If you tried to describe these things through visual cues alone, it would be nearly impossible, and your prompt would become unreadable.
For example, prompting "in the style of a 1980s grindhouse horror" would be a lot more direct, and a hell of a lot easier than "filmed in 1984 in washed out muted colours that are scary and ominous, the film is grainy and the tape is covered in dust and dirt making the picture quality quite poor."
An interesting quirk of the model is that it will also understand references to popular media when given the correct inputs. Feeding in the first paragraphs of "The Hobbit" and "Harry Potter" will generate contextually accurate shots, despite many of the keywords not appearing. Note the lack of worms, mud, or sand. So try using this to your advantage.
Be Detailed
Your descriptions can be as broad or as detailed as you desire, so go wild! The model will interpret your keywords quite directly and literally, so avoid metaphors, and don't get hung up on keywords that are producing unexpected results.
It's recommended that you experiment with as many different ways of requesting the same image as you can think of. Who knows, you might stumble upon a keyword that generates something truly unique.
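If you'd rather not type every variation by hand, you can generate them. This is a rough sketch, assuming you keep your own short lists of alternative wordings. The `prompt_variants` helper and the word lists are purely illustrative.

```python
from itertools import product


def prompt_variants(subjects, actions, settings):
    """Return every subject/action/setting combination as a prompt string."""
    return [
        f"{subject} {action} {setting}"
        for subject, action, setting in product(subjects, actions, settings)
    ]


variants = prompt_variants(
    subjects=["a horse", "a wild stallion"],
    actions=["walks", "gallops"],
    settings=["on the beach", "along the shoreline"],
)
for variant in variants:
    print(variant)
```

Two options in each slot already gives eight different ways of asking for the same image, which is a handy starting batch to compare.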
Get Cinematic
"Low angle", "high angle", "Drone shot". You can include the instruction for your camera angle and movement anywhere in your prompt, though typically best practice is to include the keywords in the most grammatically accurate way you can. For example: "a drone shot of a statue in a park", "a long distance telescopic shot of a bronze statue in a park". Try out some descriptive sentences for camera movements and compositions to see what you get.
These are just the most basic examples. As the model expands, so will it's understanding of more niche concepts!
Animals!
Characters!
Animation!
...and more! Keep exploring new keywords and trying different combinations, experimenting with large batches to develop your own style. Make sure to share your favourites on Discord!