
The Veo 6-Trait Character Lock: How to Keep the Same Character Across 12+ AI Video Clips

AI video models forget your character between clips by default. The 6-trait lock keeps the same person recognizable across 8, 12, or 20 separate generations.

By Cameron Jo'van · 9 min read
TL;DR
  • Six traits, used verbatim every time: age + gender + hair + build + distinctive feature + attire.
  • Lock them as a single repeated string. Don't paraphrase between clips. Don't reorder.
  • Combined with text-to-video (NOT image-to-video — see the Veo audio bug article), the 6-trait lock makes narrative AI video viable for the first time.

The single biggest blocker to using AI video for real narrative content in 2026 is character inconsistency. Generate clip 1, get a great-looking protagonist. Generate clip 2, get a completely different person who's supposed to be the same character. Multiply across 8-12 clips and you have unusable footage for any storytelling that requires continuity.

The 6-trait character lock solves this. It's the operator-tier prompting pattern that keeps a character recognizable across 8, 12, even 20 separate Veo generations. This article covers the exact pattern, why each trait matters, and the full workflow.

The Six Traits (In Order)

The lock uses six attributes, in this order, repeated verbatim in every reference:

1. Age (specific). "mid-thirties" or "early twenties" or "late forties." Not "young" or "older" — too vague.

2. Gender. "Woman" or "man." Describe how the character presents on screen, not a biological technicality.

3. Hair (color + length + style). "Long dark hair pulled into a low ponytail" or "short reddish-brown hair, slightly tousled." Three elements: color, length, style.

4. Build. "Athletic build" or "slim build" or "stocky build." One adjective. Don't elaborate.

5. Distinctive feature. ONE memorable visual element. "Faint freckles across the bridge of the nose" or "small scar above the left eyebrow" or "thin wire-rim glasses." This is the identity anchor — the thing that makes this person THIS person.

6. Attire. What they're wearing in this scene. "Wearing a charcoal wool coat over a cream turtleneck." Specific colors. Specific items.

The full lock as a single string:

"A woman in her mid-thirties, long dark hair pulled into a low ponytail, athletic build, faint freckles across the bridge of her nose, wearing a charcoal wool coat over a cream turtleneck"

That string, used verbatim in every clip's prompt, locks the character.
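One way to make sure the string never varies is to assemble it from the six traits in code instead of retyping it. The sketch below is only an illustration and is not part of any Veo API: the `CharacterLock` class and its field names are hypothetical, and the pronoun handling covers only the two gender values used in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterLock:
    """Hypothetical container for the six traits (not a Veo API type)."""

    age: str      # specific, e.g. "mid-thirties"
    gender: str   # "woman" or "man"
    hair: str     # color + length + style
    build: str    # e.g. "athletic build"
    feature: str  # the single distinctive feature
    attire: str   # scene-specific clothing, without the word "wearing"

    def base(self) -> str:
        """The five identity traits, without attire."""
        pronoun = "her" if self.gender == "woman" else "his"
        return (
            f"A {self.gender} in {pronoun} {self.age}, {self.hair}, "
            f"{self.build}, {self.feature}"
        )

    def full(self) -> str:
        """All six traits as one string, attire last."""
        return f"{self.base()}, wearing {self.attire}"


lock = CharacterLock(
    age="mid-thirties",
    gender="woman",
    hair="long dark hair pulled into a low ponytail",
    build="athletic build",
    feature="faint freckles across the bridge of her nose",
    attire="a charcoal wool coat over a cream turtleneck",
)
print(lock.full())
```

Freezing the dataclass is a deliberate choice here: once the traits are set, nothing later in a script can quietly change one of them between clips.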

Why The Order Matters

Veo's prompt parser weights early tokens more heavily than later ones for character generation. Traits at positions 1-4 (age, gender, hair, build) form the base identity. Traits at positions 5-6 (distinctive feature, attire) add specificity.

Reordering breaks the lock. "Long dark hair, mid-thirties woman, athletic build..." parses differently than "A woman in her mid-thirties, long dark hair, athletic build..." The first generates inconsistent hair details across clips because hair landed in position 1 where the parser expected age framing.

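Stick to the canonical order: Age → Gender → Hair → Build → Distinctive feature → Attire. In the written string, age and gender fuse into a single opening phrase ("A woman in her mid-thirties"), so the pair always leads and hair follows.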
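Because paraphrasing or reordering breaks the lock, it can help to check a batch of prompts mechanically before spending generations on them. This is a minimal sketch, assuming the prompts are held as plain Python strings; `find_drifted_prompts` is a hypothetical helper name, and the scene actions in the sample prompts are shortened for illustration.

```python
LOCK = (
    "A woman in her mid-thirties, long dark hair pulled into a low ponytail, "
    "athletic build, faint freckles across the bridge of her nose"
)


def find_drifted_prompts(lock: str, prompts: list[str]) -> list[int]:
    """Return 1-based positions of prompts that lack the lock verbatim."""
    return [i for i, prompt in enumerate(prompts, start=1) if lock not in prompt]


prompts = [
    f"{LOCK}, wearing a charcoal wool coat, walks into a dimly lit hallway.",
    # Reordered traits: hair comes first, so the verbatim check fails.
    "Long dark hair, mid-thirties woman, athletic build, sits at a kitchen table.",
    f"{LOCK}, wearing dark jeans, opens a window.",
]

print(find_drifted_prompts(LOCK, prompts))  # [2]
```

A plain substring test is intentionally strict: any reworded or reordered trait fails it, which is the behavior the lock depends on.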

Why Each Trait Matters

Age (specific): Generic age terms produce wildly different ages across generations. "Young woman" can be 18 or 28. "Mid-thirties" pins to 32-37.

Gender: Self-explanatory. Without it, the model occasionally drifts.

Hair (3 elements): Hair is the most visually salient identity feature. Specifying color, length, and style produces consistent hair across clips. Missing any of the three causes variation.

Build: Body shape varies dramatically without this anchor. An athletic, slim, or stocky build reads differently on camera and affects posing.

Distinctive feature: The one element that says "this person, not someone who looks similar." Freckles, a scar, glasses, a specific tattoo, a birthmark. ONE feature. Multiple distinctive features dilute the effect.

Attire: Within a scene, attire locks visual continuity. Between scenes (where outfits change), the other 5 traits hold identity while attire varies.

A Working Lock Example

For a 12-clip short film about a woman returning to her childhood home:

Character lock: "A woman in her mid-thirties, long dark hair pulled into a low ponytail, athletic build, faint freckles across the bridge of her nose"

Scene 1 attire: "wearing a charcoal wool coat over a cream turtleneck"

Scene 7 attire: "wearing dark jeans and a faded gray sweater"

Scene 12 attire: "wearing a navy windbreaker and hiking boots"

Across 12 generations using the same five base traits plus scene-specific attire, the same character appears recognizably in every clip.
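The split between a fixed base lock and per-scene attire can be sketched as a constant plus a lookup table. The names `BASE_LOCK`, `SCENE_ATTIRE`, and `lock_for_scene` are hypothetical; only the three scenes the example spells out are included.

```python
# The five identity traits, identical in every scene.
BASE_LOCK = (
    "A woman in her mid-thirties, long dark hair pulled into a low ponytail, "
    "athletic build, faint freckles across the bridge of her nose"
)

# Attire is the only trait allowed to change between scenes.
SCENE_ATTIRE = {
    1: "wearing a charcoal wool coat over a cream turtleneck",
    7: "wearing dark jeans and a faded gray sweater",
    12: "wearing a navy windbreaker and hiking boots",
}


def lock_for_scene(scene: int) -> str:
    """Append the scene's attire to the unchanged base lock."""
    return f"{BASE_LOCK}, {SCENE_ATTIRE[scene]}"


for scene in sorted(SCENE_ATTIRE):
    print(f"Scene {scene}: {lock_for_scene(scene)}")
```

Every string produced this way starts with the same base lock, so only the attire clause differs from clip to clip.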

The Full Veo Prompt Structure (With Lock)

The 6-trait lock fits into the broader Veo prompt structure that hits 70%+ usable output:

[Style early] + [6-trait character lock] + [Action and dialogue] + [Camera direction] + "No music. No subtitles."

Filled in:

"Cinematic short film with warm color grading and shallow depth of field. A woman in her mid-thirties, long dark hair pulled into a low ponytail, athletic build, faint freckles across the bridge of her nose, wearing a charcoal wool coat over a cream turtleneck, walks slowly into a dimly lit hallway with shafts of warm afternoon light. She pauses, voice low and weighted — I never thought I'd come back here. The camera holds at chest height for two beats, then follows her slowly down the corridor. No music. No subtitles."

That prompt structure, run through text-to-video (NOT image-to-video — see the Veo audio bug article), reliably generates a usable clip with the locked character and clean dialogue audio.
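The slot order above maps naturally onto a small helper that joins the pieces. This is a sketch under stated assumptions: `build_prompt` is a hypothetical function, each slot is assumed to arrive already punctuated, and the dialogue line is left out of the sample action to keep the example short (it would go in the same slot).

```python
SUFFIX = "No music. No subtitles."


def build_prompt(style: str, lock: str, action: str, camera: str) -> str:
    """Join the slots in the article's order; the lock is inserted verbatim."""
    return f"{style} {lock}, {action} {camera} {SUFFIX}"


prompt = build_prompt(
    style="Cinematic short film with warm color grading and shallow depth of field.",
    lock=(
        "A woman in her mid-thirties, long dark hair pulled into a low ponytail, "
        "athletic build, faint freckles across the bridge of her nose, "
        "wearing a charcoal wool coat over a cream turtleneck"
    ),
    action="walks slowly into a dimly lit hallway with shafts of warm afternoon light.",
    camera=(
        "The camera holds at chest height for two beats, "
        "then follows her slowly down the corridor."
    ),
)
print(prompt)
```

Keeping the suffix as a constant means the closing "No music. No subtitles." cannot be forgotten on any one clip.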

Multi-Character Scenes

For scenes with multiple characters, lock each with its own 6-trait string and reference every lock in the same prompt:

Character A: "A woman in her mid-thirties, long dark hair pulled into a low ponytail, athletic build, faint freckles across the bridge of her nose, wearing a charcoal wool coat over a cream turtleneck"

Character B: "A man in his early forties, short salt-and-pepper hair, stocky build, deep crow's feet around the eyes, wearing a brown leather jacket over a navy button-down"

Combined in a scene:

"Cinematic short film with warm color grading. [Character A lock] sits across a small kitchen table from [Character B lock]. She speaks softly, voice tight — You said you'd be back by Christmas. He looks down at the table for a beat. The camera holds in a medium two-shot. No music. No subtitles."

This pattern has been tested reliably with up to 3 characters in the same scene. Beyond 3, consistency degrades on the secondary characters.
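The bracketed placeholders in the combined prompt can be filled mechanically so that each lock is pasted in verbatim. A minimal sketch, assuming placeholders are written as `[name]`; `fill_locks` and the dictionary keys are hypothetical, and the dialogue beat is omitted from the template for brevity.

```python
import re

LOCKS = {
    "Character A lock": (
        "A woman in her mid-thirties, long dark hair pulled into a low ponytail, "
        "athletic build, faint freckles across the bridge of her nose, "
        "wearing a charcoal wool coat over a cream turtleneck"
    ),
    "Character B lock": (
        "A man in his early forties, short salt-and-pepper hair, stocky build, "
        "deep crow's feet around the eyes, "
        "wearing a brown leather jacket over a navy button-down"
    ),
}

TEMPLATE = (
    "Cinematic short film with warm color grading. [Character A lock] sits "
    "across a small kitchen table from [Character B lock]. The camera holds "
    "in a medium two-shot. No music. No subtitles."
)


def fill_locks(template: str, locks: dict[str, str]) -> str:
    """Replace each [name] placeholder with its lock string, verbatim."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in locks:
            raise KeyError(f"No lock defined for placeholder [{name}]")
        return locks[name]

    return re.sub(r"\[([^\]]+)\]", substitute, template)


scene_prompt = fill_locks(TEMPLATE, LOCKS)
print(scene_prompt)
```

The sketch leaves capitalization untouched, so a lock placed mid-sentence still begins with a capital "A"; whether to lowercase it is a judgment call this article does not settle.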

Where The Lock Breaks Down

A few situations to know:

Aging the character mid-story. Time-jump scenes where the character is shown 10 years later. The 6-trait lock won't naturally age the character — you need a SECOND character lock for the older version, then cut between them.

Children growing up. Same problem. Each age is a different lock.

Significant transformation scenes. A character cutting their hair dramatically mid-story. The lock works on either side, but the transformation moment itself often requires a different prompting approach (image-to-image generation, or accepting one mismatched clip during the cut).

Long-form (30+ clips of the same character). Beyond ~30 clips, even with the lock, model drift accumulates. The character at clip 35 is still recognizable but starts looking subtly different. Acceptable for most short-form work; problematic for feature-length narratives.

The Production Workflow

A working operator workflow for a 12-clip short film:

Step 1 — Lock the character (15 min). Write the 6-trait lock string. Generate 3 test clips with different scenes/poses. Verify identity holds.

Step 2 — Storyboard the 12 scenes (45 min). Write the 12 prompts. For each: scene-specific attire (if the outfit carries over, repeat the previous scene's attire string verbatim, since each generation starts with no memory of the last), action, dialogue, camera direction. Use the canonical Veo structure.

Step 3 — Generate (30-45 min). Run all 12 prompts. Expect ~70-80% first-try usable. Re-generate failures.

Step 4 — Assemble (1-2 hours). Edit in DaVinci Resolve, Premiere Pro, or Final Cut Pro. Add transitions, music, color matching.

Total time for a 60-90 second short with consistent character: ~4 hours operator time + ~$5-8 in API spend.
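The time figure can be sanity-checked by summing the step estimates, and the first-try usable rate from Step 3 gives a rough count of clips that need a second pass. The sketch below only does that arithmetic; it counts first-pass failures and ignores retries that fail again, and it makes no attempt to model API pricing.

```python
# (low, high) minutes per step, taken from the workflow above.
STEP_MINUTES = {
    "lock the character": (15, 15),
    "storyboard": (45, 45),
    "generate": (30, 45),
    "assemble": (60, 120),
}

low_total = sum(low for low, _ in STEP_MINUTES.values())
high_total = sum(high for _, high in STEP_MINUTES.values())
print(f"Step time: {low_total / 60:.2f} to {high_total / 60:.2f} hours")

CLIPS = 12
for usable_pct in (70, 80):
    # Expected number of clips that fail on the first try.
    expected_retries = CLIPS * (100 - usable_pct) / 100
    print(
        f"{usable_pct}% usable: about {expected_retries:.1f} "
        f"of {CLIPS} clips need a second pass"
    )
```

The steps alone sum to between 2.5 and 3.75 hours, and roughly 2 to 4 of the 12 clips need regenerating, which sits just under the total quoted above.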

The same content shot live (actors, location, equipment) would cost $5,000-50,000 and take 1-3 weeks.

What This Enables

Real narrative AI video. Not random clips. Stories with characters who exist across multiple scenes.

Use cases the 6-trait lock unlocks:

  • Short narrative films for YouTube
  • Branded character-led ad campaigns
  • Recurring character formats (the same protagonist appears weekly)
  • Multi-scene product demos with a consistent demo subject
  • Training/explainer videos with a host character

Before the lock pattern, AI video could produce single clips but not narratives. With the lock, narrative is viable.

The Cross-Sell

The full Veo for Creators playbook ($6.99) includes the 6 mechanical prompting rules, 12 paste-and-ship shot recipes, the full character-lock pattern with multi-character variants, the failure-mode debugging chart, and the cost calculator for typical operator workflows.

$6.99 once. Most operators recoup the cost on the first multi-clip project where the lock saves 4+ failed re-generations.

The actionable next step: pick a 3-clip test sequence (could be anything — a character walking through three different rooms, a character at three different times of day). Lock the character with the 6-trait pattern. Generate the three clips. Watch the consistency. The proof of the pattern is in the visible result.

Frequently Asked Questions

Why six traits specifically? Why not three or ten?

Three is insufficient — the model has too much latitude on under-specified attributes. Ten over-specifies and starts producing rigid, stilted character output. Six is the empirically derived sweet spot: enough to lock identity, loose enough to let the model produce natural variation in expression and pose.

Does this work with image-to-video too?

Partially. Image-to-video gives you visual reference but suffers the audio bug (covered in the [Veo audio article](/blog/veo-audio-silent-killer)). For dialogue content, use text-to-video with the 6-trait lock — the lock substitutes for the reference image.

Will the character look identical across clips?

Recognizably the same — yes. Pixel-identical — no. Expect the same person at different angles, with slightly different expression. This matches how characters look across cuts in real film.

Can I use this for multiple characters in the same scene?

Yes. Lock each character with its own 6-trait string. Reference every lock in the same prompt. Tested up to 3 characters reliably; 4+ starts losing consistency on the secondary characters.

What if the character needs to wear different clothes in different scenes?

Lock the 5 non-attire traits and vary attire as needed. The 5-trait sub-lock keeps identity stable while wardrobe changes. Be explicit: state "wearing X" in each scene's prompt rather than implying it.

Does this work in Sora or Kling?

Sora has stronger native character consistency within a single clip but drifts across separate prompts. The 6-trait lock helps there, but less reliably than in Veo. Kling has weaker consistency overall; the lock helps but doesn't fully solve the problem.

Should I include personality traits in the lock?

No. Personality is action and dialogue, not visual identity. Adding personality to the visual lock confuses the model. Show personality through what the character does and says in each scene.