Reference Image Prompting: Use Images for Style, Composition, and Materials

2026-06-12

Use reference image prompting to control style, composition, lighting, materials, and consistency without creating conflicts between text and image inputs.

Reference image prompting works best when the image has a job. A reference can control style, composition, lighting, material, identity, or product shape. It cannot reliably do all of those things at once unless the prompt tells the model which signal matters most.

Many failed reference workflows come from treating the image as a finished target. The creator uploads a polished image and expects the model to infer a set of unstated instructions: use this composition, but change the subject, preserve the lighting, ignore the outfit, keep the color mood, and do not copy the background. The model only sees a dense bundle of visual features. Without direction, it may grab the wrong ones.

Use this guide when working with image references, preparing first frames for video, or building consistency across a campaign. For video-specific continuity, see the Reference to Video guide.

Think of references as constraints

A prompt defines the scene. A reference reduces freedom in one or more visual dimensions. The mistake is expecting the reference to replace the prompt.

Useful dimensions include:

Reference dimension	What it controls
Composition	camera position, crop, subject placement, negative space
Lighting	direction, contrast, source type, falloff
Color	palette, saturation, temperature relationship
Material	fabric, metal, glass, skin, surface finish
Style	line quality, grain, rendering logic, atmosphere
Identity	face, character design, product shape, outfit

Before generating, choose the dimension. A reference image is strongest when it narrows one part of the problem instead of trying to solve the whole image.

Start with text before adding references

If the text prompt is vague, the reference has to carry too much weight. That usually causes unstable results. Build a text prompt that already describes the subject, scene, and basic camera setup.

Step 1: Generate without a reference first. Stabilize subject, scene, framing, and lighting until the output is roughly 70 percent aligned with what you want.

Step 2: Add the reference image for one clear purpose: composition, material, palette, style, or identity.

Step 3: Reduce text in the area controlled by the reference so text and image do not fight.

This workflow keeps the reference from becoming a lottery ticket.
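
Step 3 is easier to apply when the text prompt is kept as one fragment per visual dimension. The sketch below is only an illustration of that bookkeeping, not a Naviya feature: the function name assemble_prompt and the fragment keys are hypothetical, and the generation itself still happens in whatever image tool you use.

```python
from typing import Optional


def assemble_prompt(fragments: dict[str, str],
                    reference_dimension: Optional[str] = None) -> str:
    """Join text fragments, skipping the dimension the reference now controls."""
    kept = [
        text
        for dimension, text in fragments.items()
        if dimension != reference_dimension
    ]
    return " ".join(kept)


# Hypothetical fragments, one per visual dimension.
fragments = {
    "subject": "A red sneaker on a concrete plinth.",
    "composition": "Low camera angle, shoe large in the foreground.",
    "lighting": "Hard side lighting with deep shadows.",
    "material": "Matte rubber sole, woven upper.",
}

# Step 1: text only, every dimension described in words.
text_only_prompt = assemble_prompt(fragments)

# Steps 2 and 3: a composition reference is attached, so the
# composition fragment is dropped from the text.
with_composition_reference = assemble_prompt(
    fragments, reference_dimension="composition"
)
print(with_composition_reference)
```

Keeping the dropped fragment in the dictionary rather than deleting it makes it easy to restore the text if the reference is later removed.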

Use dimension locking

Dimension locking means naming exactly what the model should take from the image.

Instead of:

Use this reference image and make a red sneaker poster.

Use:

Use the reference image only for the low-angle composition and hard side lighting. Create a red sneaker poster with a different shoe design, matte rubber texture, and clean black background.

Or:

Use the reference image only for brushed metal material and highlight behavior. Do not copy the product shape, background, or color palette.

The word "only" is useful. It gives the reference a boundary.
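
If you write many locked prompts, the pattern can be templated. The following is a minimal sketch under the assumption that a locked prompt has three parts: what to take, the new request, and what to ignore. The names lock_reference and join_items are hypothetical helpers invented for this example.

```python
from typing import Sequence


def join_items(items: Sequence[str], conjunction: str) -> str:
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    items = list(items)
    if len(items) <= 2:
        return f" {conjunction} ".join(items)
    return ", ".join(items[:-1]) + f", {conjunction} " + items[-1]


def lock_reference(take: Sequence[str], request: str,
                   ignore: Sequence[str] = ()) -> str:
    """Build a prompt that bounds the reference with the word 'only'."""
    if not take:
        raise ValueError("name at least one thing to take from the reference")
    sentences = [
        f"Use the reference image only for {join_items(take, 'and')}.",
        request,
    ]
    if ignore:
        sentences.append(f"Do not copy the {join_items(ignore, 'or')}.")
    return " ".join(sentences)


locked_prompt = lock_reference(
    take=["brushed metal material", "highlight behavior"],
    request="Create a new desk lamp design on a dark table.",
    ignore=["product shape", "background", "color palette"],
)
print(locked_prompt)
```

The helper refuses an empty take list on purpose: a reference with no named job is exactly the case dimension locking is meant to prevent.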

Avoid text-image conflict

Reference images and text prompts are two instruction channels. If they disagree, the model may average them in a way nobody wanted.

Conflict examples:

The reference is a shadowy interior, but the text asks for bright noon sunlight.
The reference is a close-up portrait, but the text asks for a wide landscape.
The reference has glossy metal, but the text asks for matte clay.
The reference uses a centered subject, but the text asks for extreme off-center framing.

Sometimes conflict is intentional, but it should be controlled. If you want the composition from one image and the material from the prompt, say so clearly.

Preserve the reference composition: overhead flat lay with generous negative space. Replace the reference materials with matte ceramic, soft linen, and warm paper texture.
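
One way to catch uncontrolled conflict before generating is to jot down, per dimension, what the reference shows and what the text asks for, then list the dimensions where the two disagree and neither channel has been named the owner. This is a rough sketch that assumes hand-written notes and plain string comparison; find_conflicts and the note dictionaries are hypothetical, and nothing here inspects the image itself.

```python
def find_conflicts(reference: dict[str, str],
                   text: dict[str, str],
                   owner: dict[str, str]) -> list[str]:
    """Return dimensions where both channels differ and no owner is declared."""
    conflicts = []
    for dimension in sorted(reference.keys() & text.keys()):
        differs = reference[dimension] != text[dimension]
        if differs and dimension not in owner:
            conflicts.append(dimension)
    return conflicts


# Hand-written notes about what each channel specifies.
reference_notes = {
    "composition": "overhead flat lay",
    "material": "glossy metal",
    "lighting": "shadowy interior",
}
text_notes = {
    "material": "matte ceramic",
    "lighting": "bright noon sunlight",
}

# The prompt explicitly says the text controls material, so that
# disagreement is intentional. Lighting has no declared owner.
owner = {"material": "text"}

unresolved = find_conflicts(reference_notes, text_notes, owner)
print(unresolved)
```

Any dimension left in the result needs either an explicit owner in the prompt or a change to one of the two inputs.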

For composition language, the AI composition prompts guide can help you write the text side more cleanly.

Do not repeat everything the reference already shows

If the reference already contains a strong neon palette and you add five lines of neon color instructions, you may over-amplify the effect. The result can become oversaturated, noisy, or covered in unwanted glow.

Use text for the missing logic:

What is the new subject?
What must change?
What must stay stable?
Which reference features should be ignored?

The reference can carry the visual signal. The prompt should direct it.

Multiple references need roles

More references can help, but only if each one has a separate role.

Good reference set:

Reference 1: face identity.
Reference 2: outfit or product material.
Reference 3: lighting and color mood.

Weak reference set:

Three different faces.
Two conflicting art styles.
A product photo, a landscape, and an unrelated fashion image with no role assignment.

Use plain role language:

Use reference one for the person's face and hairstyle. Use reference two for the jacket material and silhouette. Use reference three for warm indoor lighting only. Keep the final composition as a medium close-up portrait.
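
The difference between the good and weak sets above can be checked mechanically: a set is suspect when two references claim the same role. The sketch below assumes you label each reference with a role by hand; duplicate_roles and role_language are hypothetical names, and the generated wording is just one possible phrasing.

```python
from collections import Counter


def duplicate_roles(assignments: dict[str, str]) -> list[str]:
    """Return every role that more than one reference is claiming."""
    counts = Counter(assignments.values())
    return sorted(role for role, count in counts.items() if count > 1)


def role_language(assignments: dict[str, str]) -> str:
    """Write one plain sentence per reference, refusing overlapping roles."""
    clashes = duplicate_roles(assignments)
    if clashes:
        raise ValueError(f"roles assigned more than once: {clashes}")
    return " ".join(
        f"Use {name} only for {role}." for name, role in assignments.items()
    )


good_set = {
    "reference one": "face identity",
    "reference two": "outfit material",
    "reference three": "lighting and color mood",
}
weak_set = {
    "reference one": "face identity",
    "reference two": "face identity",
    "reference three": "face identity",
}

print(role_language(good_set))
print(duplicate_roles(weak_set))
```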

Reference prompt examples

Composition reference

Use the reference image only for composition: low camera angle, product large in the foreground, city lights small in the background. Create a matte black headphone campaign image with cool night lighting and clean reflections. Do not copy the reference product or colors.

Material reference

Use the reference image only for material behavior: brushed aluminum surface, soft edge highlights, fine micro-scratches. Create a new desk lamp design on a dark table. Keep the lighting simple and avoid copying the object shape.

Style reference

Use the reference image for style only: muted color palette, grainy analog texture, wide negative space, practical window light. Create a quiet portrait of a designer in a studio, different subject and setting.

For light-specific references, combine this workflow with the AI lighting prompts guide.

Reference priority checklist

When a generation uses multiple references, decide which image has priority before prompting. Otherwise the model may average them into a result that satisfies none of them.

Use this order, from highest priority to lowest:

Product or identity reference: what must not change.
Composition reference: where objects sit in the frame.
Lighting reference: direction, contrast, and color temperature.
Style reference: texture, finish, genre, or illustration language.
Scene reference: room, location, weather, or background mood.

Write the role into the prompt. For example: "Use image A only for product shape, image B only for lighting, and image C only for room mood." If two references conflict, remove one. A glossy studio product reference and a handheld creator selfie can work together only if their jobs are separated.
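
The checklist order can also be kept as data, so a reference set is always reviewed from the most protected role down. This is a small sketch assuming each reference has been tagged with one of the five roles from the checklist; the short role keys and the function order_by_priority are illustrative names, not part of any tool.

```python
# Checklist order from this guide, highest priority first.
# "identity" stands for the product or identity reference.
PRIORITY = ["identity", "composition", "lighting", "style", "scene"]


def order_by_priority(references: dict[str, str]) -> list[tuple[str, str]]:
    """Sort (image, role) pairs so the highest-priority role comes first."""
    unknown = sorted(set(references.values()) - set(PRIORITY))
    if unknown:
        raise ValueError(f"roles not in the checklist: {unknown}")
    return sorted(references.items(), key=lambda item: PRIORITY.index(item[1]))


references = {
    "image C": "scene",
    "image A": "identity",
    "image B": "lighting",
}

ordered = order_by_priority(references)
for image, role in ordered:
    print(f"{image}: {role}")
```

When two references still conflict after role assignment, the one lower in this ordering is the natural candidate to remove.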

For video, keep reference roles even stricter. The first frame can carry composition, while Reference to Video protects identity or product shape during motion. If you need to create the first frame from scratch, use AI Image Generator, then animate with Image to Video. For product contexts, combine this method with AI product scene generation so the reference does not have to solve the entire scene.

Try it in Naviya

Use Naviya AI Image Generator to test reference roles quickly. When a still image already has the composition you want, move it into Image to Video. When the reference must preserve identity, product shape, or style across motion, use Reference to Video.

Reference prompting is not about handing the model more material. It is about reducing uncertainty. Give each image a job, remove conflicting text, and let the reference control the dimension it is best suited to control.