Google's Veo 3.1 is the best video model of 2026 for short ads. It handles lip-sync, renders logos and text in frame, understands product packaging, and shoots 9:16 for Reels and TikTok without a hitch. Sora 2 is stronger for cinematic footage, but for marketing formats of 6 seconds or less, Veo beats it on brand recognition. It's a specialized model: its strength is commercial videos with CTAs, and nothing else matches it in that area right now.

Below are 8 techniques I've refined with Veo 3.1 in Quantium's video engine, using real client videos: cosmetics, automotive, FinTech, food, and mobile apps. Each technique boosts a specific metric: attention, recognition, or CTA conversion.

Context: Veo 3.1 in Quantium creates a 6-second video with synced audio for 50 credits. That's the same price as a static product shot in a studio, except that you get a ready-to-run ad rather than a set of photos. For an average brand shooting 4-5 creatives a month, that's savings of around 50-80 thousand rubles for comparable quality. There's one condition: the prompt has to be written well. So here are the techniques, not just praise for how cool the model is.

1. 3-Act Structure in 6 Seconds

The most common mistake is rendering 6 seconds of the same thing. That kills retention. Veo 3.1 can build a dramatic arc within a short video if you ask for one. Structure: 1 second for the hook, 3 seconds for the product, 2 seconds for the final shot with the CTA.

6-second commercial in three acts:
- Act 1 (0-1s): close-up of a hand reaching for a glowing phone screen
- Act 2 (1-4s): app interface unfolds, smooth animated UI elements,
  user smiles seeing the result
- Act 3 (4-6s): final logo reveal on dark background with tagline
  "Faster banking. Less waiting." appearing in elegant typography
Result: A video with internal dynamics that people watch to the end, instead of swiping past.
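
If you produce many variants, it helps to check that the act timings actually add up before spending credits. Below is a minimal Python sketch of that idea. It only assembles prompt text; the `Act` class and `build_three_act_prompt` function are hypothetical names made up for this illustration and are not part of Veo or Quantium.

```python
from dataclasses import dataclass


@dataclass
class Act:
    start: int        # second at which the act begins
    end: int          # second at which the act ends
    description: str  # what happens on screen


def build_three_act_prompt(acts: list[Act], total_seconds: int = 6) -> str:
    """Render acts as a timed prompt; raise if timings have gaps or overrun."""
    cursor = 0
    for act in acts:
        # Each act must start exactly where the previous one ended.
        if act.start != cursor or act.end <= act.start:
            raise ValueError(f"Act {act.start}-{act.end}s is not contiguous")
        cursor = act.end
    if cursor != total_seconds:
        raise ValueError(f"Acts cover {cursor}s, expected {total_seconds}s")

    lines = [f"{total_seconds}-second commercial in {len(acts)} acts:"]
    for number, act in enumerate(acts, start=1):
        lines.append(f"- Act {number} ({act.start}-{act.end}s): {act.description}")
    return "\n".join(lines)


prompt = build_three_act_prompt([
    Act(0, 1, "close-up of a hand reaching for a glowing phone screen"),
    Act(1, 4, "app interface unfolds, smooth animated UI elements, "
              "user smiles seeing the result"),
    Act(4, 6, 'final logo reveal on dark background with tagline '
              '"Faster banking. Less waiting." appearing in elegant typography'),
])
print(prompt)
```

The check is deliberately strict: a gap or an overrun raises an error instead of silently producing a prompt whose timings the model would have to guess.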

2. On-Screen Brand Visuals

Veo 3.1 renders text and packaging better than any other video model I've worked with. Use that: describe the packaging in detail (color, shape, font, material), specify the logo's position in frame, and write out any visible product text. Given that level of detail, the model doesn't mix up letters and keeps the label stable for all 6 seconds.

A frosted glass perfume bottle with brand label "Aurora" in
gold serif typography, the bottle stands on a marble surface,
soft pink rose petals scattered around, the label remains
sharp and readable throughout the entire shot
Result: A product shot with a readable label, ready for Instagram Stories.

3. Camera Flow for the Product

The type of camera movement sets the video's mood. Slow push-in works for dramatic product reveals. Orbit shots are for 360° presentations (watches, bottles, gadgets). Top-down with a pull-up is for food and cosmetics. Whip-pans between shots are for dynamic sports ads.

360-degree orbit around a luxury watch on a black velvet surface,
camera moves smoothly clockwise, watch face catches light at the
3-second mark revealing brand logo, end with hero front-facing shot
of the watch dial in sharp focus
Result: A "premium review" product shot that sells without words.

4. Facial Expression & Text Sync

Veo 3.1's main feature is precise lip-sync between a character and the spoken line. For it to work, write the line in quotes and describe the emotion separately. Don't just say "person talks about product." Be specific: character says "..." with a warm smile. The model syncs lip movements with phonemes in any of 35+ languages.

Medium close-up of a young woman with natural makeup, she looks
into the camera with confidence and says: "Skincare that actually
works. Try Lumira free for 7 days." She smiles slightly at the end.
Soft daylight from the left, neutral background
Result: A convincing testimonial shot with realistic lip-sync in Russian or English.

5. Product Texture

Veo 3.1 can convey tactility: glass sheen, matte metal, moisture on a bottle, creamy cosmetic texture, steam from hot food. This is critical for food and beauty — without texture, the video looks like stock footage. Describe the material and its behavior clearly.

Close-up macro of a chocolate truffle being cut open, molten
caramel slowly oozes from the centre, glossy chocolate surface
catches warm side light, subtle steam visible above the cut,
shallow depth of field with extreme detail on texture
Result: A mouth-watering food ad shot that Sora 2 can't replicate.

6. Voiceover Tone

Veo 3.1 can change voice character. Specify the tone explicitly: warm and reassuring for health and FinTech, energetic and youthful for apps and games, neutral and professional for B2B, whispered and intimate for premium cosmetics and perfume, deep and dramatic for automotive and sports.

Voiceover style: warm and reassuring female voice, calm pace,
slight British accent. Voice says: "Your money. Protected.
Available 24/7." Background: soft ambient music, no harsh sounds.
Result: A voice with the right tone that doesn't sound robotic.

7. Storytelling in 6 Seconds

Veo 3.1 can tell a mini-story if it has a point A, a point B, and a conflict. Don't describe a static scene; describe a transformation. The "From X to Y" structure works best: before and after, problem and solution, a dull moment turning vibrant.

6-second transformation story: first 2 seconds show a tired
office worker staring at a cluttered desk with stacks of paper.
Then a smartphone appears in frame, app interface unfolds.
Last 2 seconds: same worker now relaxed, papers gone, smiling
at the clean screen showing organized tasks. Subtle uplifting music.
Result: An emotional arc in one video, higher conversion than a static demo.

8. Format & Platform

Veo 3.1 composes for the format you specify. 9:16 vertical is for Reels, TikTok, and Stories. 1:1 square is for the Instagram feed and LinkedIn. 16:9 horizontal is for YouTube, web banners, and horizontal footage reused in YouTube Shorts. Without a format specified, the model defaults to 16:9, and you'll have to crop it manually for vertical placements, losing the composition.

Format: 9:16 vertical for Instagram Reels and TikTok.
Composition: central subject with empty space on top and bottom
for caption overlays. Safe zones for UI elements respected.
Result: A ready-to-use vertical video without losing composition from cropping.
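
The platform-to-format mapping above is easy to encode so that the format line never gets forgotten. This is a small Python sketch under that assumption; the `ASPECT_BY_PLATFORM` table and `format_line` function are hypothetical names for this article, and the table contains only the pairings listed in this section.

```python
# Pairings taken from this section; extend the table for other placements.
ASPECT_BY_PLATFORM = {
    "Instagram Reels": "9:16 vertical",
    "TikTok": "9:16 vertical",
    "Instagram Stories": "9:16 vertical",
    "Instagram feed": "1:1 square",
    "LinkedIn": "1:1 square",
    "YouTube": "16:9 horizontal",
    "web banner": "16:9 horizontal",
}


def format_line(platforms: list[str]) -> str:
    """Return the 'Format:' line for platforms that share one aspect ratio."""
    # An unknown platform name raises KeyError here.
    formats = {ASPECT_BY_PLATFORM[name] for name in platforms}
    if len(formats) != 1:
        # One render cannot be composed for two aspect ratios at once.
        raise ValueError(f"Platforms need different formats: {sorted(formats)}")
    return f"Format: {formats.pop()} for {' and '.join(platforms)}."


print(format_line(["Instagram Reels", "TikTok"]))
```

Mixing, say, TikTok and YouTube in one call raises an error on purpose: those placements need separate renders, each composed for its own frame.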

Ready-Made Templates for 5 Industries

Cosmetics & Beauty

Macro shot of a glass cream jar with cream texture visible inside,
golden label "Lumira" in elegant typography, soft pink studio
lighting from above, pearl droplets on the jar surface, slow
push-in over 5 seconds, female voiceover: "Skin that glows.
Naturally." 9:16 vertical, premium beauty commercial style

Automotive

Cinematic 6-second car commercial: low angle tracking shot of
a black electric SUV driving on a coastal mountain road at sunset,
camera dollies alongside the vehicle, lens flares from sun,
brand logo "Nova EV" appears in the final 2 seconds with tagline
"Drive the future." Dramatic deep male voiceover, ambient strings

FinTech & Banking

Clean modern interior, a person taps their phone on a payment
terminal, success animation appears on screen, subtle confetti effect,
brand interface "Finto" visible. Warm reassuring female voiceover:
"Pay in 0.3 seconds. Tax-free transfers worldwide." Soft daylight,
9:16 vertical, neutral professional aesthetic

Food & Delivery

Top-down shot of a sizzling pan with vegetables and herbs,
steam rising in soft golden light, camera slowly rises and tilts
to reveal a finished dish being placed on a wooden table, hands
sprinkling fresh basil. Voiceover: "Restaurant taste. Delivered hot.
In 25 minutes." Energetic but warm tone, 16:9 horizontal

Mobile Apps

6-second app demo: phone in hand, screen unlocks showing the
"FlowTask" app interface, finger taps add new task, tasks rearrange
with smooth animation, completion ticks light up green, final shot
shows clean empty inbox. Energetic youthful male voiceover:
"All your tasks. Zero chaos." 9:16 vertical for Reels and TikTok

Veo 3.1 Prompt Template

[Format] — 9:16 / 1:1 / 16:9
[3-act structure] — hook, product, CTA
[Subject & action] — who/what is doing what
[Brand visual] — packaging, logo, text
[Camera move] — push-in / orbit / top-down / whip-pan
[Dialogue / voiceover] — line + tone
[Texture detail] — material, sheen, movement
[Music style] — background track
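
If you fill in this template often, a few lines of Python can assemble the fields in a fixed order. This is only a sketch: `FIELD_ORDER` and `assemble_prompt` are hypothetical names, the field keys are my own shorthand for the eight slots above, and the result is plain prompt text rather than a call to any Veo or Quantium API.

```python
# Shorthand keys for the eight template slots, in the order listed above.
FIELD_ORDER = [
    "format", "structure", "subject", "brand_visual",
    "camera_move", "voiceover", "texture", "music",
]


def assemble_prompt(fields: dict[str, str]) -> str:
    """Join the filled-in template slots into one prompt, in template order."""
    unknown = set(fields) - set(FIELD_ORDER)
    if unknown:
        raise ValueError(f"Unknown template fields: {sorted(unknown)}")
    # Skip empty slots; trim whitespace and a trailing period before joining.
    parts = [
        fields[key].strip().rstrip(".")
        for key in FIELD_ORDER
        if fields.get(key, "").strip()
    ]
    if not parts:
        raise ValueError("At least one template field must be filled in")
    return ". ".join(parts) + "."


fields = {
    "format": "9:16 vertical for Reels and TikTok",
    "subject": "A frosted glass perfume bottle stands on a marble surface",
    "brand_visual": 'brand label "Aurora" in gold serif typography',
    "camera_move": "slow push-in",
    "music": "soft ambient music",
}
print(assemble_prompt(fields))
```

Keeping the order fixed means the format always comes first, so it is never buried at the end of a long description.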

Top 5 Mistakes When Working with Veo 3.1

I've seen the same old mistakes on real client projects. Here are five to avoid.

Mistake 1. The prompt is too long. Veo 3.1 handles descriptions over 250 words poorly: the model starts ignoring secondary parameters. Stick to 80-150 words in the main prompt.

Mistake 2. Dialogue without quotes. If you write 'character talks about product,' the model just makes up random text. The line must be in quotes, in the language it should be spoken in: says: "Exact dialogue text".

Mistake 3. 'Apple-like' logos. This doesn't work, because Veo doesn't reproduce other brands. Describe your logo in words: "brand name 'X' in elegant serif font, gold colour".

Mistake 4. No format specified. Without '9:16 vertical,' the model gives you 16:9. Crop it for Reels, and you'll lose your composition. Always specify the format.

Mistake 5. Too many actions in 6 seconds. If your prompt says 'character enters cafe, orders coffee, sits at table, opens laptop,' the model squishes it all into one blurry frame. One act, one action.
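
Three of these mistakes (length, missing quotes, missing format) can be caught mechanically before you spend credits. The Python sketch below shows one way to do that. `lint_prompt` is a hypothetical helper, the thresholds are the ones quoted above, and the speech detection is a rough keyword heuristic that I am assuming is good enough for a first pass.

```python
import re


def lint_prompt(prompt: str) -> list[str]:
    """Return warnings for prompt mistakes 1, 2 and 4 described above."""
    warnings = []

    # Mistake 1: length. 80-150 words suggested, over 250 is too long.
    words = len(prompt.split())
    if words > 250:
        warnings.append(f"{words} words: over 250, details may be ignored")
    elif words > 150:
        warnings.append(f"{words} words: above the suggested 80-150 range")

    # Mistake 4: no aspect ratio anywhere in the prompt.
    if not re.search(r"(?<!\d)(9:16|1:1|16:9)(?!\d)", prompt):
        warnings.append("no format: add 9:16, 1:1 or 16:9")

    # Mistake 2: speech is mentioned but no quoted line is given.
    # Keyword match only (heuristic); accepts straight, curly and « » quotes.
    mentions_speech = re.search(r"\b(says|voiceover|voice)\b", prompt, re.IGNORECASE)
    has_quoted_line = re.search(r'["“«][^"”»]+["”»]', prompt)
    if mentions_speech and not has_quoted_line:
        warnings.append('speech without a quoted line: write says: "..."')

    return warnings


print(lint_prompt("Character talks about product in a voiceover"))
```

Mistakes 3 and 5 are about content rather than form, so they still need a human read.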

What It Costs

Veo 3.1 in Quantium costs 50 credits for a 6-second video with audio in Standard mode, and 100 credits in Pro mode. With the Basic plan at 3000 credits, you get 60 Standard ad videos a month. That's enough for a full SMM strategy for one brand. The voiceover comes with the video in one call, so there's no separate TTS purchase.
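
The budget arithmetic is simple integer division, shown here as a short Python sketch using the prices quoted above. `videos_per_month` is a hypothetical helper, and the Pro figure is my own calculation from the 100-credit price rather than a number published by Quantium.

```python
def videos_per_month(plan_credits: int, credits_per_video: int) -> int:
    """How many whole videos a monthly credit allowance covers."""
    if credits_per_video <= 0:
        raise ValueError("credits_per_video must be positive")
    # Partial videos cannot be rendered, so round down.
    return plan_credits // credits_per_video


BASIC_PLAN_CREDITS = 3000

print(videos_per_month(BASIC_PLAN_CREDITS, 50))   # Standard: 60 videos
print(videos_per_month(BASIC_PLAN_CREDITS, 100))  # Pro: 30 videos
```

Keep in mind that rejected takes cost credits too, so the number of usable ads per month will be lower than the raw count.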

For a comparison with Sora 2 and Kling, check out Sora 2 vs Veo 3.1 and Kling vs Veo. If your video is about art and cinematic style, see the Sora 2 Prompt Guide.

FAQs

How much does a Veo 3.1 video cost in Quantium?

Veo 3.1 is 50 credits for a 6-second video with audio in Standard, 100 in Pro. With the Basic plan at 3000 credits, that's 60 ad videos a month.

Can I add a brand logo to a Veo video?

Yes, Veo 3.1 renders text and simple logo shapes directly in the frame. For pixel-perfect logos, use image-to-video with a pre-made frame.

What format works for Instagram Reels and TikTok?

Specify '9:16 vertical' — Veo builds the composition with a central subject and empty space for captions. 6 seconds fits perfectly into the first retention window.

Does Veo 3.1 understand Russian for voiceovers?

Yes, Veo 3.1 synthesizes speech in 35+ languages, including Russian. Lip sync works correctly. Write the prompt in English, but put dialogue in the desired language within quotes.

How does Veo 3.1 differ from Sora 2?

Veo 3.1 is built for advertising: lip-sync, on-screen text, brand visuals, 9:16. Sora 2 is stronger for cinematic footage and longer shots. For marketing, it's Veo; for art films, it's Sora.

Quantium Editorial
