Why Do This in Your Browser?
Background removal used to mean a cloud GPU and a per-minute bill. The cloud workflow uploads your video, queues against shared infrastructure, runs segmentation on a server, and streams the result back — typically tens of MB up and tens of MB down for a single clip. For a 90-second product demo or a talking-head intro, that round trip is the whole job.
MediaPipe's Selfie Segmentation model is small enough (about 1.5 MB) and fast enough (60–120 ms per frame on a recent laptop) to run frame-by-frame in your browser. The tool reads each video frame to a canvas, runs MediaPipe locally to get a person-vs-background mask, composites the chosen replacement, then re-encodes with FFmpeg.wasm. Nothing leaves the device.
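To put the per-frame figure in context, here is a rough back-of-the-envelope estimate of segmentation time for a 90-second clip. The 30 fps frame rate is an assumption for illustration, the function name is hypothetical, and the estimate covers segmentation only, not decoding or the FFmpeg.wasm re-encode.

```typescript
// Rough segmentation-time estimate. Illustrative only: the frame rate is an
// assumed input, and decode/encode time is not included.
function estimateSegmentationSeconds(
  durationSeconds: number,
  frameRate: number,
  msPerFrame: number
): number {
  const frameCount = Math.round(durationSeconds * frameRate);
  return (frameCount * msPerFrame) / 1000;
}

// 90 s at an assumed 30 fps is 2700 frames.
const bestCaseSeconds = estimateSegmentationSeconds(90, 30, 60);   // 162
const worstCaseSeconds = estimateSegmentationSeconds(90, 30, 120); // 324
```

Under those assumptions a 90-second clip needs roughly three to five and a half minutes of segmentation on a recent laptop, and halving the clip length halves that figure.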
How It Works
Drop a video. The tool extracts frames via an HTMLVideoElement decoded at the original frame rate, feeds each one through MediaPipe Selfie Segmentation, and composites the frame against your chosen replacement background using the segmentation mask. A solid-color background exports as standard MP4. An image background is scaled to match the video's aspect ratio. Transparent output writes a VP9 WebM with an alpha channel, which plays in Chrome, Firefox, and most modern editors.
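The compositing step can be sketched as a per-pixel blend. This is a minimal illustration, not the tool's actual code: it assumes the frame is RGBA pixel data (as returned by a canvas `getImageData` call) and that the mask holds one value per pixel in the range 0 to 1, where 1 means "person". The function names are hypothetical.

```typescript
type RGB = [number, number, number];

// Clamp a mask value into [0, 1] so out-of-range values cannot overshoot.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Solid-color mode: blend each RGBA pixel against the background color.
// Assumes mask[i] is the "person" confidence for pixel i.
function compositeSolid(
  frame: Uint8ClampedArray,
  mask: Float32Array,
  background: RGB
): Uint8ClampedArray {
  if (frame.length !== mask.length * 4) {
    throw new Error("frame and mask sizes do not match");
  }
  const out = new Uint8ClampedArray(frame.length);
  for (let i = 0; i < mask.length; i++) {
    const m = clamp01(mask[i]);
    const p = i * 4;
    for (let c = 0; c < 3; c++) {
      out[p + c] = Math.round(frame[p + c] * m + background[c] * (1 - m));
    }
    out[p + 3] = 255; // fully opaque result
  }
  return out;
}

// Transparent mode: keep the original colors and write the mask into alpha.
function applyMaskAsAlpha(
  frame: Uint8ClampedArray,
  mask: Float32Array
): Uint8ClampedArray {
  if (frame.length !== mask.length * 4) {
    throw new Error("frame and mask sizes do not match");
  }
  const out = new Uint8ClampedArray(frame); // copies the pixel data
  for (let i = 0; i < mask.length; i++) {
    out[i * 4 + 3] = Math.round(255 * clamp01(mask[i]));
  }
  return out;
}
```

Intermediate mask values produce a soft edge around hair and shoulders, which is why the blend is weighted rather than a hard person/background switch.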
Segmentation quality is best on talking-head footage with the subject clearly separated from the background. Cluttered scenes, very fast motion, and subjects partially out of frame produce mask flicker. The tool exposes a 'mask smoothing' slider that temporally averages masks across 3–5 adjacent frames to reduce shimmer.
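One plausible way to implement that kind of temporal averaging is a centered moving average over the mask sequence. The sketch below is an assumption about the approach, not the tool's confirmed implementation: it averages each mask with its neighbors, where a hypothetical `radius` of 1 gives a 3-frame window and 2 gives a 5-frame window, and it shortens the window at the start and end of the clip.

```typescript
// Temporal mask smoothing as a centered moving average.
// Assumes every mask has the same length (one value per pixel).
// radius 1 averages 3 frames, radius 2 averages 5 frames.
function smoothMasks(masks: Float32Array[], radius: number): Float32Array[] {
  return masks.map((current, t) => {
    // Clip the window at the first and last frame of the sequence.
    const start = Math.max(0, t - radius);
    const end = Math.min(masks.length - 1, t + radius);
    const out = new Float32Array(current.length);

    // Sum the neighboring masks pixel by pixel.
    for (let k = start; k <= end; k++) {
      const neighbor = masks[k];
      for (let i = 0; i < out.length; i++) {
        out[i] += neighbor[i];
      }
    }

    // Divide by the number of frames actually in the window.
    const count = end - start + 1;
    for (let i = 0; i < out.length; i++) {
      out[i] /= count;
    }
    return out;
  });
}
```

A wider window suppresses more shimmer but makes the mask lag behind fast movement, so larger settings suit mostly static talking-head shots.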
Tip: If your source clip needs trimming before background removal, run it through the Video Trimmer first — segmentation cost scales linearly with duration. For face-tracked vertical reframes, the Video Face Auto-Crop tool pairs naturally with this one.
Common Use Cases
How We Compare
Honest read on free, paid, and self-hosted options for this kind of job: