Sora 2 can do cinematic shots with physics and synced audio, but if you just throw in «a man walking in the city, cinematic,» you'll get surrealism and a floating perspective. The model picks up details: the more precisely you describe a shot, the less it has to guess. There's a huge difference between «portrait, photorealistic» and an 80-word description detailing the lens, lighting, and atmosphere. The first gives you a stock image; the second delivers a reference-quality shot.

Below are 10 techniques I've been running on Sora 2 for the past two months using Quantium's video engine. Each one is a single parameter that noticeably changes the outcome, from lighting and camera movement to negative prompts, and each comes with a prompt example and the result you can expect. All techniques were tested on real-world tasks: indie game trailers, background video for landing pages, concept scenes for art projects, and music videos.

Before we dive into the techniques, one principle: more detail, less guesswork. The model has seen millions of shots with EXIF data, director's notes, and DP instructions. It distinguishes «35mm anamorphic» from «50mm spherical,» understands the difference between «golden hour» and «blue hour,» and recognizes specific DP references like Roger Deakins, Emmanuel Lubezki, and Greig Fraser. Use this vocabulary and write your prompts in English: the model was trained on English descriptions, and Russian terms lose precision when it converts them.

Composition and Physics: Techniques 1-2

1. Shot Composition

Sora 2 understands classic film terminology. Be explicit about your shot: extreme close-up, medium shot, wide establishing shot, over-the-shoulder. Otherwise, the model defaults to a medium shot and cuts off the character's hands at the frame's edge.

Wide establishing shot of a rain-soaked Tokyo street at dusk,
neon signs reflecting on wet asphalt, low camera angle
Result: a wide urban scene with depth, reflections work, no awkward cropping

2. Physics and Materials

Sora 2's main strength is simulating physical processes. The more explicitly you ask it to render physics, the cleaner the result. Specify: liquid splash, cloth folds, smoke dissipation, glass shatter, particle dispersion. The model accurately calculates material behavior.

A coffee cup spills across a wooden table in slow motion,
liquid forms thin sheets, droplets bounce, cloth napkin absorbs
the edge of the spill, realistic surface tension
Result: realistic splash with correct liquid distribution, not cartoonish

Lighting: Technique 3

3. Three-Point Lighting Scheme

A cinematic shot relies on light, and Sora 2 can set up classic three-point lighting. Specify all three sources: key light, fill light, and rim light. It's not magic—the model's seen millions of shots with proper lighting setups.

Portrait of a violinist on stage, key light from upper-left at 45 degrees,
soft fill light from camera-right, rim light behind the head separating
subject from dark background, slight haze in beam paths
Result: a volumetric portrait separated from the background, the shot looks like it was filmed in a large studio

Camera Movement: Technique 4

4. Specific Movement Type

Sora 2 understands the difference between movement types. Don't write «camera moves»—that's a placebo. Instead, use: slow dolly in, tracking shot following subject, orbit 180 degrees around subject, handheld with subtle shake, crane up revealing landscape. The model distinguishes all these movements and keeps them in frame.

Slow dolly in on a chess board, pieces in dramatic side lighting,
camera moves from wide to extreme close-up over 5 seconds,
shallow depth of field shifting focus to the king piece
Result: smooth dolly-in with correct focus shift, no jerks or lost composition

Atmosphere: Technique 5

5. Air Between Planes

Atmospheric effects add more cinematic quality than any LUT. Specify: volumetric fog, morning haze, dust particles in light beams, steam rising from streets, light rain. Just one of these elements turns a flat AI render into a deep, layered shot.

A lone figure walks through a forest at dawn, volumetric god rays
between trees, morning haze filling the middle ground, dust particles
visible in light beams, soft atmospheric depth
Result: classic «forest in rays,» volumetric air, realistic depth of field

Color Palette: Technique 6

6. Color Through Film References

Sora 2 knows film and director references—it's the quickest way to set a palette with a single line. «In the color palette of Blade Runner 2049» gives you an orange-blue-green tonality. «Wes Anderson symmetrical pastels» delivers pastel, centered shots. «Roger Deakins lighting» provides soft, golden natural light.

An old man reads a letter by the window, in the color palette of
"The Revenant": desaturated cold tones with warm interior accents,
natural light only, Emmanuel Lubezki-style lighting
Result: a distinct reference look without needing to describe each color manually

Audio Synchronization: Technique 7

7. Describe Sound with Action

Sora 2 synthesizes audio in the same generation as the video, but only if you ask. Specify the layers you need: diegetic sound, ambient noise, a character speaking a line in quotation marks, or background music such as «cinematic strings, slow tempo». Character dialogue can be written in any supported language; the model will voice it in sync with the character's lips.

A barista pours steamed milk into a cup, ambient cafe noise,
soft jazz playing quietly in background, the barista says
"Your latte is ready" with a gentle smile, sound of milk frothing
synchronized with the pour
Result: video with realistic sound design, lip sync, and object sounds

Duration and Tempo: Technique 8

8. Specify Shot Tempo

Sora 2 in Quantium generates 5-second videos. Without a specified tempo, the model defaults to «normal»: all the movement happens in the first 2 seconds, leaving the remaining 3 empty. Specify the pacing: slow motion 0.25x for impactful shots, real-time for conversations, time lapse for nature and clouds, or a two-act beat (setup, then reveal) for dramatic pacing.

5-second shot with two-act beat: first 2 seconds show closed gift box
on a table in soft light, then in the last 3 seconds hands enter
the frame and slowly open the box, anticipation building
Result: a video with internal drama, not just a static scene with one mechanic
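The timing clause in that prompt can be generated rather than typed by hand. Below is a minimal sketch, assuming you assemble prompts as plain strings; the function name `two_act_beat` and its parameters are hypothetical and not part of Sora 2 or Quantium.

```python
def two_act_beat(total_seconds: int, setup_seconds: int, setup: str, reveal: str) -> str:
    """Format a two-act timing clause for a single shot (illustrative helper)."""
    if not 0 < setup_seconds < total_seconds:
        raise ValueError("setup must be shorter than the whole shot")
    # Whatever is left after the setup belongs to the reveal.
    reveal_seconds = total_seconds - setup_seconds
    return (
        f"{total_seconds}-second shot with two-act beat: "
        f"first {setup_seconds} seconds {setup}, "
        f"then in the last {reveal_seconds} seconds {reveal}"
    )


beat = two_act_beat(
    5,
    2,
    "show closed gift box on a table in soft light",
    "hands enter the frame and slowly open the box",
)
print(beat)
```

Splitting the seconds explicitly makes it harder to write a beat that does not add up to the clip length.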

Style Priming: Technique 9

9. Specify Format and Medium

One prefix changes everything. «Cinematic» gives you the standard film look. «Shot on 35mm film» adds grain and warm tones. «16mm documentary» creates a reportage shot. «Anamorphic lens with horizontal flares» delivers widescreen cinematography with characteristic streaks. «VHS-style 1990s home video» stylizes it as retro.

Shot on 35mm film, anamorphic lens, slight horizontal lens flares,
a vintage car driving down a coastal highway at golden hour,
desaturated blue sky, warm shadows, cinematic widescreen 2.39:1
Result: a consistent film style that holds for all 5 seconds

Negative Prompts: Technique 10

10. What Shouldn't Be There

Sora 2 accepts negative prompts via an «avoid:» or «no:» block at the end of your prompt. This removes typical artifacts: extra fingers, distorted faces, watermarks, floating objects. Keep your list short, 4-6 items max—the model starts ignoring longer negative blocks. For a detailed syntax breakdown and 20 ready-to-use blocks, check out our separate article on negative prompts.

... main prompt ...
Avoid: distorted faces, extra limbs, warped hands, floating objects,
text artifacts, jittery camera
Result: clean shots without classic AI bugs, especially in close-ups
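If prompts are assembled in code, the 4-6 item limit is easy to enforce automatically. This is a rough sketch, assuming the «Avoid:» prefix described above; `with_avoid_block` and `MAX_NEGATIVE_ITEMS` are hypothetical names introduced only for illustration.

```python
MAX_NEGATIVE_ITEMS = 6  # upper bound suggested in this article, not an API limit


def with_avoid_block(prompt: str, artifacts: list, limit: int = MAX_NEGATIVE_ITEMS) -> str:
    """Append a short 'Avoid:' line to a prompt, dropping duplicates and extras."""
    cleaned = [a.strip() for a in artifacts if a.strip()]
    # dict.fromkeys removes duplicates while keeping the original order.
    unique = list(dict.fromkeys(cleaned))
    kept = unique[:limit]
    if not kept:
        return prompt
    return f"{prompt.rstrip()}\nAvoid: {', '.join(kept)}"


result = with_avoid_block(
    "Close-up portrait of a violinist on stage",
    [
        "distorted faces", "extra limbs", "warped hands", "floating objects",
        "text artifacts", "oversaturated colors", "jittery camera", "distorted faces",
    ],
)
print(result)
```

Items beyond the limit are silently dropped, so put the artifacts most likely to appear in the scene first.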

Ready-Made Prompt Template

When you're too lazy to build from scratch—here's a formula that works for any Sora 2 scene:

[Shot type] — wide / medium / close-up / extreme close-up
[Subject and action] — who and what they're doing
[Camera movement] — slow dolly in / orbit / handheld / static
[Lighting] — key + fill + rim, color temperature
[Atmosphere] — fog / haze / dust / rain
[Color palette] — film reference or specific look
[Audio] — ambient / dialogue / music
[Style prefix] — cinematic / 35mm / anamorphic
[Negative] — avoid: ...
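The template can also be kept as a small data structure and rendered on demand. The sketch below is one possible way to do it, assuming a comma-separated prompt with the negative block on its own line; the class `ShotPrompt` and its field names are hypothetical and simply mirror the slots above.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """One field per template slot; all names here are illustrative."""
    shot_type: str
    subject_action: str
    camera_movement: str = ""
    lighting: str = ""
    atmosphere: str = ""
    color_palette: str = ""
    audio: str = ""
    style_prefix: str = ""
    negative: tuple = ()

    def render(self) -> str:
        # Keep the slot order of the template and skip any slot left empty.
        slots = [
            f"{self.shot_type} of {self.subject_action}",
            self.camera_movement,
            self.lighting,
            self.atmosphere,
            self.color_palette,
            self.audio,
            self.style_prefix,
        ]
        text = ", ".join(slot for slot in slots if slot)
        if self.negative:
            text += "\nAvoid: " + ", ".join(self.negative)
        return text


prompt = ShotPrompt(
    shot_type="Wide establishing shot",
    subject_action="a rain-soaked Tokyo street at dusk",
    camera_movement="slow dolly in",
    atmosphere="light rain, steam rising from streets",
    audio="ambient noise",
    style_prefix="shot on 35mm film",
    negative=("floating objects", "jittery camera"),
).render()
print(prompt)
```

Leaving a field empty drops that slot from the prompt, which makes it easy to see which parameters a given shot actually specifies.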

Workflow: How I Build a Prompt in 3 Minutes

For real client videos, I don't write prompts from scratch. I've got a fixed pipeline that takes 3 minutes and delivers consistent results.

Step 1. Reference. I open Pinterest or Vimeo, find a shot that matches what I'm going for. I download it — that's my visual anchor.

Step 2. Decipher. I describe that shot in text, using the nine parameters from techniques 1-9 above: composition, physics, light, camera, atmosphere, color, audio, tempo, style. That's 80% of the work.

Step 3. Negative. I add 4-6 items for potential artifacts specific to that scene. If there's a face in the shot, I add "no distorted faces." If it's hands, "no extra fingers." If text, "no garbled text."

Step 4. Image-to-video. I upload the reference shot as a starter. Sora 2 accepts image-to-video, which drastically improves style consistency. The text prompt then just describes motion and audio.

Step 5. Iterate. The first generation is almost always close, but rarely perfect. I change one parameter at a time: first camera movement, then light, then duration. By the third try, I usually nail the shot.
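The "one parameter at a time" rule from Step 5 can be sketched as a small generator. This assumes the prompt is stored as a dictionary of named parameters; `single_change_variants` and the keys used here are hypothetical and only illustrate the idea.

```python
def single_change_variants(base: dict, alternatives: dict):
    """Yield (changed_key, variant) pairs that differ from base in exactly one parameter."""
    for key, options in alternatives.items():
        for option in options:
            if option == base.get(key):
                continue  # identical to the base value, nothing to test
            variant = dict(base)  # copy so the base prompt stays untouched
            variant[key] = option
            yield key, variant


base = {
    "camera": "slow dolly in",
    "lighting": "key light from upper-left at 45 degrees",
    "duration": "5 seconds",
}
# Order follows the workflow: camera movement first, then light, then duration.
alternatives = {
    "camera": ["orbit 180 degrees around subject", "slow dolly in"],
    "lighting": ["rim light behind the head"],
    "duration": ["4 seconds"],
}
variants = list(single_change_variants(base, alternatives))
for changed, variant in variants:
    print(changed, "->", variant[changed])
```

Because each variant differs from the base in a single key, any change in the output can be traced back to that one parameter.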

What It Costs

Sora 2 in Quantium runs 60 credits for a 5-second video in Standard, and 120 credits in Pro mode with priority queuing. On the Basic plan, with 3000 credits a month, that's 50 videos — more than one a day. If you're doing client videos, it makes sense to start with Standard to refine your prompt, then switch to Pro only for the final shot.
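The arithmetic behind those numbers, as a quick sketch. The prices and the plan size are taken from the paragraph above; the function names and the "three Standard drafts plus one Pro final" scenario are assumptions made for illustration.

```python
STANDARD_COST = 60         # credits per 5-second video, Standard
PRO_COST = 120             # credits per 5-second video, Pro
BASIC_PLAN_CREDITS = 3000  # credits per month on the Basic plan


def videos_per_month(credits: int, cost_per_video: int) -> int:
    """Whole number of videos a credit balance covers."""
    return credits // cost_per_video


def credits_for_shot(standard_drafts: int, pro_finals: int = 1) -> int:
    """Cost of refining a prompt in Standard, then rendering the final in Pro."""
    return standard_drafts * STANDARD_COST + pro_finals * PRO_COST


standard_only = videos_per_month(BASIC_PLAN_CREDITS, STANDARD_COST)
pro_only = videos_per_month(BASIC_PLAN_CREDITS, PRO_COST)
per_shot = credits_for_shot(standard_drafts=3)  # hypothetical: three drafts, one final
finished_shots = videos_per_month(BASIC_PLAN_CREDITS, per_shot)
print(standard_only, pro_only, per_shot, finished_shots)  # 50 25 300 10
```

Under that assumed scenario, a Basic plan covers about ten finished shots a month rather than 50 raw generations.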

Audio and video generate in one call — no need to buy separate voiceovers. That seriously changes the economics against a "Veo + ElevenLabs" setup. For short ad videos with lip sync and product text, you'll want to switch to Veo 3.1; it's specifically tuned for marketing. Get more details on the comparison in Sora 2 vs Veo 3.1.

Related materials: image-to-video from an existing picture, Kling vs Veo, work gallery.

FAQ

How much does one Sora 2 generation cost in Quantium?

Sora 2 is 60 credits for a 5-second video in Standard, 120 in Pro mode. On the Basic plan with 3000 credits, that's 50 videos a month. Audio comes in one call with the video.

Does Sora 2 support Russian for voiceovers?

Yes, Sora 2 synthesizes speech in 30+ languages, including Russian. Write the scene prompt itself in English, but character lines go in the desired language within quotation marks.

What's the difference between Sora 2 and Veo 3.1?

Sora 2 is stronger for cinematic scenes with physics and 8+ second durations. Veo 3.1 is more precise for short ad videos with lip sync and product text.

Can I upload a reference image to Sora 2?

Yes, image-to-video in Sora 2 takes one frame and brings it to life with a text prompt. It's the most stable way to lock in style and character.

What if Sora generates artifacts on faces?

Use negative prompts like "no distorted faces, no warped features." Reduce duration to 4 seconds, change close-up to a medium shot — that usually gets rid of 90% of artifacts.

Quantium Editorial 30+ AI models in one Telegram bot
