How Video Containers and Codecs Actually Work
Why MP4 is a box and H.264 is the video inside, when remuxing is instant but re-encoding takes forever, and why GIF is terrible.
Slow motion looks simple from the outside: play the clip slower and you get a dramatic cinematic effect. In practice, that often produces stutter because the timeline stretches but the original frame count does not. If your source is 30 fps, every second only contains 30 unique moments in time.
To make slow motion feel smooth, software has to decide how to create the in-between moments. That decision is what separates three common strategies: frame duplication, frame blending, and optical-flow interpolation.
When you apply a slower playback speed, the clip duration increases. A 5-second shot at 30 fps has 150 frames. At half speed, that shot becomes 10 seconds. To maintain 30 fps output, the renderer now needs 300 displayed frames. It only has 150 originals, so it must invent the missing 150 somehow.
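The bookkeeping generalizes to any speed factor. A minimal sketch, with the helper name frames_needed chosen here for illustration:

```python
def frames_needed(source_frames, source_fps, speed, output_fps):
    """Return (displayed_frames, frames_to_invent) when a clip is retimed.

    speed < 1.0 means slow motion (0.5 = half speed).
    """
    duration = source_frames / source_fps       # seconds in the original clip
    stretched = duration / speed                # seconds after retiming
    displayed = round(stretched * output_fps)   # frames the renderer must show
    return displayed, max(0, displayed - source_frames)

# The example from the text: 5 s of 30 fps footage, played at half speed.
print(frames_needed(150, 30, 0.5, 30))  # → (300, 150)
```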
Half of those displayed frames are ones the software has to generate from scratch. The strategy it uses determines whether the result looks choppy, ghosty, or cinematic.
Frame duplication keeps original frames intact and repeats them to fill time. In FFmpeg this pairs with the setpts filter, which stretches timestamps: you extend timing without adding any motion detail.
The upside is speed and reliability: duplicated frames never hallucinate artifacts. The downside is visible stepping on fast motion, because each original moment lingers longer before the next unique frame appears.
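Duplication is trivial to model: each source frame is simply shown more than once. A sketch on a list of frame labels (the function name and list representation are illustrative, not any real API):

```python
def duplicate_frames(frames, factor):
    """Repeat every frame `factor` times, preserving order.

    factor=2 turns 30 fps footage into half-speed 30 fps output
    without synthesizing any new imagery.
    """
    return [f for f in frames for _ in range(factor)]

print(duplicate_frames(["A", "B", "C"], 2))  # → ['A', 'A', 'B', 'B', 'C', 'C']
```

Because every output frame is an untouched original, this is the one strategy that can never introduce artifacts; it can only make the stepping more visible.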
Blending mixes adjacent frames to synthesize intermediates. Instead of showing frame A then frame B, it creates a weighted average between them. Motion appears less jumpy because transitions are softer.
But blending does not understand object boundaries. If an athlete moves across the frame, blending can leave transparent “double image” trails. The image looks smoother yet softer, with visible ghosting around edges.
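Both the averaging and the ghosting it causes show up on a toy one-dimensional "frame" where a bright object (value 5) moves two pixels between frames. This is a hypothetical illustration, not FFmpeg's actual implementation:

```python
def blend(prev, nxt, weight=0.5):
    """Weighted average of two frames, pixel by pixel."""
    return [weight * a + (1 - weight) * b for a, b in zip(prev, nxt)]

frame_a = [0, 0, 5, 0, 0, 0]   # object at position 2
frame_b = [0, 0, 0, 0, 5, 0]   # object has moved to position 4
print(blend(frame_a, frame_b))
# → [0.0, 0.0, 2.5, 0.0, 2.5, 0.0]: two half-bright ghosts, no object at the midpoint
```

The object never appears where it actually was at the in-between moment; instead two faded copies sit at its old and new positions, which is exactly the double-image trail described above.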
Motion-compensated interpolation (FFmpeg's mi_mode=mci) estimates how pixels move between frames. Instead of averaging two images, it builds motion vectors and warps content forward and backward to synthesize a plausible in-between frame.
This produces the smoothest slow motion with sharp moving subjects, especially on high-shutter, well-lit footage. It is also the most expensive option and can fail on difficult scenes: occlusions, motion blur, smoke, and flashing lights can confuse vector estimation.
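The vector idea can be sketched on the same toy frames: search for the shift that best aligns the two frames (a crude stand-in for block matching), then warp the content halfway along that vector instead of averaging in place. All names here are illustrative, and FFmpeg's minterpolate works on 2-D macroblocks rather than whole frames:

```python
def estimate_shift(prev, nxt, max_shift=3):
    """Find the integer shift minimizing pixel difference (crude motion estimation)."""
    def get(frame, i):
        return frame[i] if 0 <= i < len(frame) else 0

    def cost(s):
        return sum(abs(get(prev, i) - get(nxt, i + s))
                   for i in range(-max_shift, len(prev) + max_shift))

    return min(range(-max_shift, max_shift + 1), key=cost)

def warp(frame, shift):
    """Move every pixel by `shift` positions; pixels pushed off the edge are lost."""
    out = [0] * len(frame)
    for i, v in enumerate(frame):
        if 0 <= i + shift < len(out):
            out[i + shift] = v
    return out

def mci_midframe(prev, nxt):
    """Synthesize the in-between frame by warping halfway along the motion vector."""
    shift = estimate_shift(prev, nxt)
    return warp(prev, shift // 2)

frame_a = [0, 0, 5, 0, 0, 0]   # object at position 2
frame_b = [0, 0, 0, 0, 5, 0]   # object at position 4
print(mci_midframe(frame_a, frame_b))  # → [0, 0, 0, 5, 0, 0]: sharp, at the midpoint
```

Compare this with the blended result: the object lands full-brightness at position 3, exactly where it should be mid-motion. The failure modes from the text also fall out of this sketch; when occlusion or repeated texture makes the cost function ambiguous, the estimated vector is wrong and the warp puts content in the wrong place.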
Frame duplication: great for rough cuts, screen recordings, and workflows where processing time matters more than perfect smoothness.
Frame blending: useful when duplicate-frame judder is too obvious but full optical flow is overkill.
Optical-flow interpolation: ideal for hero moments where motion quality is part of the story and render time is acceptable.