Prompt guide

Learn how to write video generation prompts that produce great results

Written by Bartol Freskura
Updated this week

General guide

1. Volume beats perfection

Stop trying to create the perfect video. Generate 10 decent videos and select the best one. This approach consistently outperforms perfectionist single-shot attempts.

2. Systematic beats creative

Proven formulas + small variations outperform completely original concepts every time. Study what works, then execute it better.

3. Embrace the AI aesthetic

Stop fighting what AI looks like. Beautiful impossibility engages more than uncanny valley realism. Lean into what only AI can create.

The 5-part prompt structure:

[SHOT TYPE] + [SUBJECT] + [ACTION] + [STYLE] + [CAMERA MOVEMENT]

This baseline works across thousands of generations. Everything else is variation on this foundation.
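
As a rough illustration, the bracketed components above can be assembled in a fixed order with a small helper. This is only a sketch: the `ShotPrompt` class, its field names, and the comma separator are assumptions made for this example, not something the video tool requires.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """Hypothetical container for the bracketed components of the structure."""

    shot_type: str
    subject: str
    action: str
    style: str
    camera_movement: str

    def render(self) -> str:
        # Keep the components in the order given by the structure so the
        # shot type and subject always come first.
        parts = [
            self.shot_type,
            self.subject,
            self.action,
            self.style,
            self.camera_movement,
        ]
        # Joining with commas is an assumption; any consistent separator works.
        return ", ".join(p.strip() for p in parts if p.strip())


example = ShotPrompt(
    shot_type="Medium shot",
    subject="a woman in a red coat",
    action="walking through rain",
    style="Shot on Arri Alexa",
    camera_movement="slow push in",
)
print(example.render())
```

Keeping the components as separate fields makes small variations easy: change one field, render again, and compare the results.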

Front-load important elements

“Beautiful woman dancing” ≠ “Woman, beautiful, dancing.” Order matters significantly.

One action per prompt rule

Multiple actions create AI confusion. “Walking while talking while eating” = chaos. Keep it simple for consistent results.

Systematic seed approach

Random seeds = random results.

Suggested workflow:

  1. Test the same prompt with seeds 1000-1010

  2. Judge on shape, readability, technical quality

  3. Use best seed as foundation for variations

  4. Build seed library organized by content type
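
The workflow above could be sketched as a small seed sweep. Everything named here is hypothetical: `generate` stands in for whatever call produces a video from a prompt and a seed, and `score` stands in for your own judgment of shape, readability, and technical quality expressed as a number. Neither is a real API.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Seeds 1000-1010 inclusive, as in step 1 of the workflow.
SEEDS = range(1000, 1011)


def sweep_seeds(
    prompt: str,
    seeds: Iterable[int],
    generate: Callable[[str, int], object],
    score: Callable[[object], float],
) -> List[Tuple[int, float]]:
    """Run one prompt across several seeds and rank the seeds, best first."""
    scored = [(seed, score(generate(prompt, seed))) for seed in seeds]
    # Highest score first; ties fall back to the lower seed number.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def remember_best(
    library: Dict[str, List[int]],
    content_type: str,
    ranked: List[Tuple[int, float]],
) -> int:
    """Store the top-ranked seed in the library under its content type."""
    best_seed = ranked[0][0]
    library.setdefault(content_type, []).append(best_seed)
    return best_seed
```

In practice `generate` would wrap your video tool and `score` would record a rating you assign after watching each clip. The library is a plain dictionary keyed by content type, so the best seed for, say, product shots can be reused as the starting point for later variations.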

Camera movements that consistently work

  • Slow push/pull: Most reliable, professional feel

  • Orbit around subject: Great for products and reveals

  • Handheld follow: Adds energy without chaos

  • Static with subject movement: Often highest quality

Avoid: Complex combinations (“pan while zooming during dolly”). One movement type per generation.
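
One way to catch stacked camera movements before generating is a crude keyword check. This is a sketch under stated assumptions: the `CAMERA_MOVES` list is an illustrative vocabulary chosen for this example, and prefix matching can produce false positives (for instance, "panel" would count as "pan").

```python
import re
from typing import List

# Assumed keyword list for illustration; extend it with your own vocabulary.
CAMERA_MOVES = ("pan", "tilt", "zoom", "dolly", "orbit", "push", "pull", "handheld")


def camera_moves_in(prompt: str) -> List[str]:
    """Return the camera-movement keywords that appear in the prompt."""
    lowered = prompt.lower()
    # \b anchors the start of a word, so "zooming" still matches "zoom".
    return [move for move in CAMERA_MOVES if re.search(rf"\b{move}", lowered)]


def has_single_move(prompt: str) -> bool:
    """True when the prompt names at most one movement type."""
    return len(camera_moves_in(prompt)) <= 1


print(camera_moves_in("pan while zooming during dolly"))
print(has_single_move("Slow push in on a coffee cup"))
```

A prompt that fails the check is a candidate for splitting into separate generations, one movement each.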

Style references that actually deliver

Camera specs: “Shot on Arri Alexa,” “Shot on iPhone 15 Pro”

Director styles: “Wes Anderson style,” “David Fincher style”

Movie cinematography: “Blade Runner 2049 cinematography”

Color grades: “Teal and orange grade,” “Golden hour grade”

Avoid: Vague terms like “cinematic,” “high quality,” “professional”
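
A similar check can flag the vague terms listed above so they can be swapped for a concrete reference. This is a minimal sketch: `VAGUE_TERMS` contains only the three examples from this guide, and the function name is made up for this example.

```python
import re
from typing import List

# The three vague terms named in this guide; add your own as needed.
VAGUE_TERMS = ("cinematic", "high quality", "professional")


def vague_terms_in(prompt: str) -> List[str]:
    """Return the vague style terms found in the prompt, as whole words."""
    lowered = prompt.lower()
    return [
        term
        for term in VAGUE_TERMS
        # Whole-word match, so "cinematography" is not flagged as "cinematic".
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    ]


print(vague_terms_in("Cinematic, high quality shot of a car"))
print(vague_terms_in("Blade Runner 2049 cinematography, teal and orange grade"))
```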


Image to video guide

  1. Basic structure: Since the input image already provides the scene, reduce (or even omit) descriptions of static or unchanged parts. Focus the prompt on what moves: the movement of the main subject, the movement or change of the background, and the movement of the camera.

  2. Simple and direct: Use simple words and sentence structures. The model expands the prompt based on your wording and its own understanding of the image to generate a video that matches your intent.

  3. Feature description: When the main subject has prominent features, mention them so the model can identify it, such as "an old man" or "a woman wearing sunglasses." When describing movement, state adverbs of degree explicitly, such as "quickly" or "with large amplitude."

  4. Follow the picture: Write the prompt based on the content of the input image, and clearly state the main subject and the action or camera movement you want. The prompt must not contradict the content or basic parameters of the image.


    Bad examples:

    • There is a man in the picture, but the prompt is: "a woman is dancing"

    • The image background is grassland, but the prompt says "a man is singing in a coffee shop"

  5. Negative prompts do not take effect: The model does not respond to negative prompts.

