AI Video Prompt Guide: A Simple Structure for Better Motion
2026-06-12

Use this AI video prompt structure to improve motion, camera control, lighting, character consistency, and short-form video results.

Tags: ai video prompts, prompt guide, camera control, video generation

Use references when identity, product shape, outfit, or style needs to stay consistent.

AI video prompting is different from image prompting. A still image prompt can describe a finished composition. A video prompt has to describe what changes over time.

The best prompts are not longer. They are clearer.

If you want examples built for text-only scene generation, use text to video prompt examples. If you already have a first frame, use image to video prompts. If the output is a campaign asset, use AI video ads prompts.

The five-part video prompt

Use this structure:

  1. Subject.
  2. Scene.
  3. Camera.
  4. Motion.
  5. Constraints.

Here is the template:

Subject: [who or what is visible]
Scene: [place, style, lighting, mood]
Camera: [shot size and camera movement]
Motion: [what changes during the clip]
Constraints: [what must stay stable]
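If you generate prompts from a script, the template can be filled in programmatically. The following is a minimal sketch, not part of any tool's API: the VideoPrompt class and its render method are hypothetical names chosen for illustration. It refuses to render a prompt when any of the five parts is empty.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the five prompt parts."""
    subject: str
    scene: str
    camera: str
    motion: str
    constraints: str

    def render(self) -> str:
        """Return the prompt in the five-line template format."""
        parts = [
            ("Subject", self.subject),
            ("Scene", self.scene),
            ("Camera", self.camera),
            ("Motion", self.motion),
            ("Constraints", self.constraints),
        ]
        # Refuse to render if any part is blank, so gaps are caught early.
        missing = [label for label, value in parts if not value.strip()]
        if missing:
            raise ValueError("Missing prompt parts: " + ", ".join(missing))
        return "\n".join(f"{label}: {value.strip()}" for label, value in parts)


prompt = VideoPrompt(
    subject="a matte black headset on a reflective studio table",
    scene="premium dark product photography, violet edge light",
    camera="slow orbit from front-left to center",
    motion="light sweep across the product surface",
    constraints="keep product shape stable, keep logo readable",
)
print(prompt.render())
```

Keeping the parts as separate fields makes it easy to change one part at a time when a generation fails.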

Subject

The subject should be concrete. Avoid vague phrases like "a cool character" or "a beautiful scene." Name the visible details.

Better:

Subject: a young creator in a black streetwear jacket holding a glowing camera rig.

Weaker:

Subject: a cool person making content.

Scene

The scene sets the visual world. Include location, lighting, and style.

Examples:

  • Dark studio with violet rim light and soft haze.
  • Rainy neon street at night with reflections on the pavement.
  • Minimal product table with black acrylic surface and controlled highlights.
  • Anime city rooftop at sunset with warm clouds and wind.

Camera

Camera language is the easiest way to improve video quality.

Useful camera phrases:

  • Slow push-in.
  • Locked tripod shot.
  • Gentle orbit.
  • Side tracking shot.
  • Handheld documentary feel.
  • Low-angle hero shot.
  • Close-up with shallow depth of field.

Do not stack too many camera moves in one clip. A five-second video usually needs one camera idea.
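The one-camera-idea rule can be checked before you submit a prompt. This is a rough sketch under stated assumptions: the list of move keywords is a hypothetical starting set, not a complete vocabulary, and the function names are illustrative. Phrases that describe a feel rather than a move, such as "handheld", are left out on purpose.

```python
import re

# Assumed keyword set for camera moves; extend it for your own prompts.
_MOVE_PATTERN = re.compile(r"\b(push-in|orbit|tracking|dolly|pan|zoom|tilt)\b")


def camera_moves(camera_line: str) -> list[str]:
    """Return the distinct camera moves named in a Camera line, in order."""
    found = _MOVE_PATTERN.findall(camera_line.lower())
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(found))


def has_single_camera_idea(camera_line: str) -> bool:
    """True when the line names at most one camera move."""
    return len(camera_moves(camera_line)) <= 1


print(camera_moves("slow push-in from medium portrait to close-up"))
print(has_single_camera_idea("orbit, then dolly and zoom"))
```

A keyword match is only a hint. Read the flagged line yourself before rewriting it.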

Motion

Motion is what the viewer notices second. Keep it specific and believable.

Good motion:

  • Hair and jacket move in wind.
  • The subject turns slightly toward camera.
  • Light sweeps across the product surface.
  • The camera pushes through floating particles.
  • A character blinks once and smiles subtly.

Risky motion:

  • Full-body fight choreography.
  • Rapid transformation into another character.
  • Complex object interaction with hands.
  • Multiple people crossing and changing positions.

Constraints

Constraints tell the model what not to break.

Examples:

  • Preserve face and outfit.
  • Keep the product logo readable.
  • Keep the same composition.
  • No extra people.
  • No sudden scene changes.
  • Avoid warped hands.

Use constraints when the image, identity, or product matters.

Three ready-to-use prompts

Cinematic portrait

Subject: a cinematic portrait of a silver-haired anime character in a black jacket.
Scene: dark studio, violet rim light, soft haze, floating particles.
Camera: slow push-in from medium portrait to close-up.
Motion: natural blink, subtle head turn, hair moving in a light breeze.
Constraints: preserve face, outfit, and composition. No sudden scene change.

Product reveal

Subject: a matte black headset on a reflective studio table.
Scene: premium dark product photography, violet edge light, clean background.
Camera: slow orbit from front-left to center.
Motion: light sweep across the product surface, faint smoke in the background.
Constraints: keep product shape stable, no extra objects, keep logo readable.

Social creator clip

Subject: a creator standing on a neon city rooftop holding a phone.
Scene: anime-inspired night skyline, wet reflections, blue and violet lights.
Camera: handheld slow push-in.
Motion: jacket moves in wind, phone screen glows, background signs flicker.
Constraints: preserve character identity, avoid distorted hands.

The prompt debugging loop

If the video fails, diagnose the failure:

  • Identity changed: strengthen reference and constraints.
  • Motion is weak: simplify the scene and make motion more explicit.
  • Scene changed: ask for locked composition.
  • Camera is chaotic: use one camera movement.
  • Hands are broken: avoid hand actions or crop tighter.

The goal is not to write a perfect prompt once. The goal is to build a repeatable prompt system that improves with each generation.
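To make the loop repeatable, the diagnosis list can live in a small lookup table. This is a minimal sketch: the failure labels and the suggest_fixes function are hypothetical names, and the fix text is taken from the list above.

```python
# Failure labels are illustrative; the fixes mirror the diagnosis list.
FIXES = {
    "identity_changed": "Strengthen reference and constraints.",
    "weak_motion": "Simplify the scene and make motion more explicit.",
    "scene_changed": "Ask for locked composition.",
    "chaotic_camera": "Use one camera movement.",
    "broken_hands": "Avoid hand actions or crop tighter.",
}


def suggest_fixes(failures: list[str]) -> list[str]:
    """Return one fix per observed failure, in the order given."""
    unknown = [name for name in failures if name not in FIXES]
    if unknown:
        raise KeyError(f"Unknown failure labels: {unknown}")
    return [FIXES[name] for name in failures]


for fix in suggest_fixes(["chaotic_camera", "broken_hands"]):
    print(fix)
```

Logging the failure label next to each generation shows which part of the prompt breaks most often.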

Prompt patterns by output type

Different video jobs need different prompt priorities.

Output | First priority | Safer motion
Product reveal | preserve shape and material | light sweep, slow push-in, slight orbit
Creator clip | preserve face and hands | blink, slight head turn, handheld push-in
Anime edit | preserve art style and character design | hair movement, rain, background parallax
Social ad | first-second clarity | hook motion, caption-safe framing
Cinematic scene | camera and atmosphere | dolly, pan, foreground movement

For product work, pair this structure with the product image to video guide. For short-form social placements, use the AI video hook examples to choose the first second before writing the full prompt.

Before and after rewrite

Weak prompt:

Make a cinematic video of a perfume bottle with cool motion and luxury style.

Stronger prompt:

Subject: glass perfume bottle centered on a black reflective table.
Scene: dark premium studio with subtle violet rim light.
Camera: slow push-in from medium product shot to closer hero framing.
Motion: a narrow highlight sweeps across the bottle surface while the product stays stable.
Constraints: preserve bottle shape, cap, color, and label area. No fake text or extra objects.

The stronger version does not just add words. It gives the model a subject, a physical light path, a single camera move, and clear protection rules.

Build prompts in layers

A strong video prompt usually has five layers. The first layer is subject truth: what must remain recognizable. The second is scene logic: where the subject is and why the light behaves that way. The third is camera motion: one move that fits the format. The fourth is subject or environment motion: what changes during the clip. The fifth is protection: what should not change.

When a generation fails, repair the layer that failed. If the product warps, strengthen subject truth and protection. If the clip feels still, improve the motion layer. If it looks random, simplify scene logic. If the camera is chaotic, replace an orbit, dolly, and zoom combination with one clean push-in. This layered approach is faster than adding more cinematic adjectives.

For business videos, also define the job of the clip. A product-page loop needs stability. A social hook needs immediate motion. A brand teaser needs mood and recognition. A tutorial or explainer needs clarity. The same visual idea can require different prompt priorities depending on where the video will be used.

Use duration as a constraint, not an afterthought. A four-second clip should have one idea: reveal, gesture, light sweep, or loop. A six-to-eight-second clip can hold a beginning, middle, and end. Longer prompts should usually become shot lists instead of one overloaded generation. If the prompt contains three locations, two camera moves, and several actions, split it into smaller clips and let editing create the sequence.
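The duration rule can be turned into a simple splitter. This is a sketch under assumptions: the cutoff of six seconds and the cap of three ideas per clip are one reading of the guidance above, not fixed limits, and the function names are hypothetical.

```python
def idea_budget(duration_seconds: float) -> int:
    """Assumed budget: one idea under six seconds, otherwise up to three."""
    return 1 if duration_seconds < 6 else 3


def plan_clips(ideas: list[str], duration_seconds: float) -> list[list[str]]:
    """Split a list of ideas into clips that each fit the idea budget."""
    budget = idea_budget(duration_seconds)
    return [ideas[i:i + budget] for i in range(0, len(ideas), budget)]


# Three ideas at four seconds each become three separate clips.
print(plan_clips(["reveal", "light sweep", "loop"], 4))
# Four ideas at eight seconds become one three-beat clip plus one more.
print(plan_clips(["establish", "approach", "reveal", "logo hold"], 8))
```

Each inner list is one generation. Editing then joins the clips into the final sequence.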

Keep a prompt library by use case. Save examples for product reveals, character portraits, UGC-style clips, environment scenes, and motion posters. When a new brief arrives, start from the closest proven structure and change subject, setting, and motion deliberately. This is faster and more reliable than writing every prompt from a blank page.

Review every saved prompt with its output. A library is useful only when the example still performs well, not when it merely sounds impressive.
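A prompt library can be as small as a list of records with a review flag. This is a minimal sketch: the SavedPrompt fields and the proven_prompts function are hypothetical, and the still_performs flag is assumed to be set by hand after you review the latest output.

```python
from dataclasses import dataclass


@dataclass
class SavedPrompt:
    """Hypothetical library entry for one saved prompt."""
    use_case: str         # e.g. "product reveal", "character portrait"
    prompt: str           # the full five-part prompt text
    still_performs: bool  # set manually after reviewing the latest output


def proven_prompts(library: list[SavedPrompt], use_case: str) -> list[SavedPrompt]:
    """Return entries for a use case whose output still holds up."""
    return [
        entry for entry in library
        if entry.use_case == use_case and entry.still_performs
    ]


library = [
    SavedPrompt("product reveal", "Subject: a matte black headset ...", True),
    SavedPrompt("product reveal", "Subject: a glass perfume bottle ...", False),
    SavedPrompt("character portrait", "Subject: a silver-haired anime character ...", True),
]
for entry in proven_prompts(library, "product reveal"):
    print(entry.prompt)
```

Filtering on the review flag keeps stale examples out of new briefs without deleting them.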

Try it in Naviya

Use Naviya AI Video Generator for prompt-first scenes. If you already have a still image, use Image to Video so the first frame anchors the subject before motion begins.