
2026-06-12
Reference Image Prompting: Use Images for Style, Composition, and Materials
Use reference image prompting to control style, composition, lighting, materials, and consistency without creating conflicts between text and image inputs.
Reference image prompting works best when the image has a job. A reference can control style, composition, lighting, material, identity, or product shape. It cannot reliably do all of those things at once unless the prompt tells the model which signal matters most.
Many failed reference workflows come from treating the image as a finished target. The creator uploads a polished image and expects the model to understand: use this composition, but change the subject, preserve the lighting, ignore the outfit, keep the color mood, and do not copy the background. The model sees a dense bundle of visual features. Without direction, it may grab the wrong ones.
Use this guide when working with image references, preparing first frames for video, or building consistency across a campaign. For video-specific continuity, see the reference to video guide.
Think of references as constraints
A prompt defines the scene. A reference reduces freedom in one or more visual dimensions. The mistake is expecting the reference to replace the prompt.
Useful dimensions include:
| Reference dimension | What it controls |
|---|---|
| Composition | camera position, crop, subject placement, negative space |
| Lighting | direction, contrast, source type, falloff |
| Color | palette, saturation, warm-to-cool temperature balance |
| Material | fabric, metal, glass, skin, surface finish |
| Style | line quality, grain, rendering logic, atmosphere |
| Identity | face, character design, product shape, outfit |
Before generating, choose the dimension. A reference image is strongest when it narrows one part of the problem instead of trying to solve the whole image.
Start with text before adding references
If the text prompt is vague, the reference has to carry too much weight. That usually causes unstable results. Build a text prompt that already describes the subject, scene, and basic camera setup.
Step 1: Generate without a reference first. Stabilize subject, scene, framing, and lighting until the output is roughly 70 percent of the way to what you want.
Step 2: Add the reference image for one clear purpose: composition, material, palette, style, or identity.
Step 3: Reduce the text in the area the reference now controls so text and image do not fight.
This workflow keeps the reference from becoming a lottery ticket.
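The third step can be sketched in a few lines of Python. This is only an illustration, assuming the text prompt is kept as one clause per dimension; `trim_text_for_reference` and the clause names are hypothetical and not part of any Naviya API.

```python
def trim_text_for_reference(text_parts: dict[str, str], reference_dimension: str) -> str:
    """Drop the text clause for the dimension the reference now controls."""
    kept = [
        clause
        for dimension, clause in text_parts.items()
        if dimension != reference_dimension
    ]
    return " ".join(kept)


# Hypothetical text prompt, split by the dimension each clause describes.
text_parts = {
    "subject": "A matte black headphone on a concrete ledge.",
    "composition": "Low camera angle, product large in the foreground.",
    "lighting": "Cool night lighting with clean reflections.",
}

# The reference image now carries composition, so that clause is removed.
trimmed = trim_text_for_reference(text_parts, "composition")
print(trimmed)
```

Keeping the prompt as separate clauses makes it easy to see which part of the text competes with the reference before you generate.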
Use dimension locking
Dimension locking means naming exactly what the model should take from the image.
Instead of:
Use this reference image and make a red sneaker poster.
Use:
Use the reference image only for the low-angle composition and hard side lighting. Create a red sneaker poster with a different shoe design, matte rubber texture, and clean black background.
Or:
Use the reference image only for brushed metal material and highlight behavior. Do not copy the product shape, background, or color palette.
The word "only" is useful. It gives the reference a boundary.
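If you assemble prompts in a script, dimension locking can be made a habit with a small template. The sketch below assumes nothing about how a model parses the prompt; `lock_dimensions` and `join_words` are hypothetical helper names that only build the text.

```python
from typing import Sequence


def join_words(items: Sequence[str], last: str) -> str:
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    items = list(items)
    if len(items) <= 2:
        return f" {last} ".join(items)
    return ", ".join(items[:-1]) + f", {last} " + items[-1]


def lock_dimensions(task: str, take: Sequence[str], ignore: Sequence[str] = ()) -> str:
    """Build a prompt that names exactly what to take from the reference."""
    if not take:
        raise ValueError("name at least one dimension to take from the reference")
    prompt = f"Use the reference image only for {join_words(take, 'and')}. {task}"
    if ignore:
        prompt += f" Do not copy the {join_words(ignore, 'or')}."
    return prompt


locked = lock_dimensions(
    task="Create a new desk lamp design on a dark table.",
    take=["brushed metal material", "highlight behavior"],
    ignore=["product shape", "background", "color palette"],
)
print(locked)
```

The function refuses an empty `take` list on purpose: a reference with no named job is the situation this section warns against.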
Avoid text-image conflict
Reference images and text prompts are two instruction channels. If they disagree, the model may average them in a way nobody wanted.
Conflict examples:
- The reference is a shadowy interior, but the text asks for bright noon sunlight.
- The reference is a close-up portrait, but the text asks for a wide landscape.
- The reference has glossy metal, but the text asks for matte clay.
- The reference uses a centered subject, but the text asks for extreme off-center framing.
Sometimes conflict is intentional, but it should be controlled. If you want the composition from one image and the material from the prompt, say so clearly.
Preserve the reference composition: overhead flat lay with generous negative space. Replace the reference materials with matte ceramic, soft linen, and warm paper texture.
For composition language, the AI composition prompts guide can help you write the text side more cleanly.
Do not repeat everything the reference already shows
If the reference already contains a strong neon palette and you add five lines of neon color instructions, you may over-amplify the effect. The result can become oversaturated, noisy, or covered in unwanted glow.
Use text for the missing logic:
- What is the new subject?
- What must change?
- What must stay stable?
- Which reference features should be ignored?
The reference can carry the visual signal. The prompt should direct it.
Multiple references need roles
More references can help, but only if each one has a separate role.
Good reference set:
- Reference 1: face identity.
- Reference 2: outfit or product material.
- Reference 3: lighting and color mood.
Weak reference set:
- Three different faces.
- Two conflicting art styles.
- A product photo, a landscape, and an unrelated fashion image with no role assignment.
Use plain role language:
Use reference one for the person's face and hairstyle. Use reference two for the jacket material and silhouette. Use reference three for warm indoor lighting only. Keep the final composition as a medium close-up portrait.
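One way to enforce "a separate role per reference" before prompting is a quick check like the one below. It is a sketch under the assumption that you track roles as a mapping from a reference label to its job; `assign_roles` is a hypothetical helper, not a product feature.

```python
def assign_roles(roles: dict[str, str]) -> str:
    """Turn {reference label: job} into role sentences, rejecting shared jobs."""
    claimed_by: dict[str, str] = {}
    for label, job in roles.items():
        if job in claimed_by:
            # Two references with the same job is the "three different faces" problem.
            raise ValueError(f"{claimed_by[job]} and {label} both claim '{job}'")
        claimed_by[job] = label
    return " ".join(f"Use {label} only for {job}." for label, job in roles.items())


role_text = assign_roles(
    {
        "reference one": "the person's face and hairstyle",
        "reference two": "the jacket material and silhouette",
        "reference three": "warm indoor lighting",
    }
)
print(role_text)
```

The check only compares job strings exactly, so it will not notice two differently worded jobs that overlap; it catches the obvious duplicates and leaves the judgment calls to you.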
Reference prompt examples
Composition reference
Use the reference image only for composition: low camera angle, product large in the foreground, city lights small in the background. Create a matte black headphone campaign image with cool night lighting and clean reflections. Do not copy the reference product or colors.
Material reference
Use the reference image only for material behavior: brushed aluminum surface, soft edge highlights, fine micro-scratches. Create a new desk lamp design on a dark table. Keep the lighting simple and avoid copying the object shape.
Style reference
Use the reference image for style only: muted color palette, grainy analog texture, wide negative space, practical window light. Create a quiet portrait of a designer in a studio, different subject and setting.
For light-specific references, combine this workflow with the AI lighting prompts guide.
Reference priority checklist
When a generation uses multiple references, decide which image has priority before prompting. Otherwise the model may average them into a result that satisfies none of them.
Use this order:
- Product or identity reference: what must not change.
- Composition reference: where objects sit in the frame.
- Lighting reference: direction, contrast, and color temperature.
- Style reference: texture, finish, genre, or illustration language.
- Scene reference: room, location, weather, or background mood.
Write the role into the prompt. For example: "Use image A only for product shape, image B only for lighting, and image C only for room mood." If two references conflict, remove one. A glossy studio product reference and a handheld creator selfie can work together only if their jobs are separated.
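The priority order above can also be encoded so that role sentences always appear from most to least important. This is a minimal sketch: the short role names in `PRIORITY` are my own labels for the five checklist items, and `order_by_priority` is a hypothetical helper.

```python
# Assumed short names for the checklist items, highest priority first.
PRIORITY = ["identity", "composition", "lighting", "style", "scene"]


def order_by_priority(references: dict[str, str]) -> list[tuple[str, str]]:
    """Sort (image, role) pairs so the highest-priority role comes first."""
    unknown = set(references.values()) - set(PRIORITY)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    return sorted(references.items(), key=lambda item: PRIORITY.index(item[1]))


ordered = order_by_priority(
    {"image C": "scene", "image A": "identity", "image B": "lighting"}
)
priority_text = " ".join(f"Use {image} only for {role}." for image, role in ordered)
print(priority_text)
```

Whether a model actually weights earlier sentences more heavily is not something this sketch can guarantee; the ordering mainly keeps your own decision about priority explicit.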
For video, keep reference roles stricter still. The first frame can carry composition, while Reference to Video protects identity or product shape during motion. If you need to create the first frame from scratch, use AI Image Generator, then animate it with Image to Video. In product work, combine this method with AI product scene generation so the reference does not have to solve the entire scene.
Try it in Naviya
Use Naviya AI Image Generator to test reference roles quickly. When a still image already has the composition you want, move it into Image to Video. When the reference must preserve identity, product shape, or style across motion, use Reference to Video.
Reference prompting is not about handing the model more material. It is about reducing uncertainty. Give each image a job, remove conflicting text, and let the reference control the dimension it is best suited to control.