AI · May 14, 2026 · 4 min read

Depth Anything V2 Small runs in your browser — and it unlocks a lot of designer-side image work

The most quietly useful AI release of the year for designers wasn't a chatbot or an image generator — it was Depth Anything V2 Small, a depth-estimation model that takes a photograph and produces a grayscale map of how far each pixel sits from the camera. The Small variant ships under Apache 2.0 and weighs roughly 25 MB quantized, which means it now runs in any browser with WebGPU support in a couple of seconds per image. The original use case was robotics and computer vision; the designer-facing use cases are broader than the original authors probably expected.

AI Depth Estimator
Drop a photo, get a depth map. Runs locally via transformers.js.

What a depth map actually gives you

A depth map is a single-channel grayscale image where pixel brightness encodes distance from the camera — white is near, black is far, or vice versa depending on convention. It is not a 3D model and it cannot show you what's behind the things in the foreground, but it answers a question that has historically required expensive sensors or hours of manual masking: which pixels in this image are part of the same depth plane?

For a designer working from photography, that one piece of information turns out to be a lever for a surprisingly long list of effects. The model is doing what previously required either a LiDAR-equipped phone, a stereo camera rig, or a patient designer with a Wacom and a fine-tip brush.

Five concrete designer uses

First, scroll parallax. Split a hero photograph into 3-4 depth bands using the map as a mask, and move each band at a different rate as the user scrolls. The result is a true parallax effect rather than the usual cheat of nudging a single image at one rate behind static foreground content. The CSS Scroll Animation Builder pairs naturally with this — load the depth-segmented layers and drive each layer's transform off the same scroll timeline at different speeds.
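A minimal sketch of the banding step, assuming the depth map arrives as a flat array of 0–255 grayscale values. The function name, band count, and per-band scroll speeds are illustrative choices, not part of any UDT tool's API:

```javascript
// Sketch: split a normalized depth map into N equal-range bands.
// `depth` is a flat array of grayscale values (here 255 = near, 0 = far;
// check your model's convention). Returns one binary mask per band, which
// you can use to cut the photo into separately scrolling layers.
function depthBands(depth, numBands = 4) {
  const bandSize = 256 / numBands;
  const masks = Array.from({ length: numBands }, () => new Uint8Array(depth.length));
  for (let i = 0; i < depth.length; i++) {
    const band = Math.min(numBands - 1, Math.floor(depth[i] / bandSize));
    masks[band][i] = 1;
  }
  return masks;
}

// Each resulting layer then scrolls at its own rate, e.g. nearer bands faster:
const speeds = [0.2, 0.5, 0.8, 1.0]; // hypothetical per-band scroll multipliers
```

From there, each masked layer becomes a positioned element whose transform is driven off the shared scroll timeline at its band's speed.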

Second, displacement maps. Most design software accepts a grayscale image as a displacement source; feeding a depth map produces realistic warps that follow the actual geometry of the photograph rather than the eyeball-estimated geometry a hand-drawn mask would have. Useful for fabric overlays on people, water surface ripples on a landscape, or distortion effects pinned to the actual shape of an object.
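The per-pixel arithmetic behind a displacement map can be sketched on a single row of pixels. The mid-gray-as-zero-shift convention below is one common choice among several, and the function is illustrative rather than any particular tool's implementation:

```javascript
// Sketch: use a depth row as a horizontal displacement source, the way
// design tools consume a grayscale displacement map. Each output pixel is
// sampled from a source position shifted in proportion to its depth value.
// `row` and `depthRow` are same-length arrays; `strength` is the max shift
// in pixels.
function displaceRow(row, depthRow, strength) {
  const out = new Array(row.length);
  for (let x = 0; x < row.length; x++) {
    // Map depth 0..255 to a shift of -strength..+strength around mid-gray.
    const shift = Math.round(((depthRow[x] - 128) / 128) * strength);
    const src = Math.min(row.length - 1, Math.max(0, x + shift));
    out[x] = row[src];
  }
  return out;
}
```

Because the shift follows the photograph's real geometry, an overlay warped this way hugs the subject instead of sliding across it.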

Third, fake depth-of-field. Blur an image by a kernel whose radius is proportional to the depth-map value at each pixel and you get a credible tilt-shift or shallow-DOF effect from a photo shot at f/8. The result will not fool a photographer at full size, but it ships at thumbnail and hero sizes where the original photographer's choice of aperture was the only thing standing between you and the effect.
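The radius mapping is simple enough to sketch. The `focusDepth` parameter and the linear falloff are assumptions; a real pass would still need a variable-radius blur to consume the result:

```javascript
// Sketch: per-pixel blur radius for a fake depth-of-field pass. Pixels at
// the chosen focus depth stay sharp (radius 0); the radius grows linearly
// with distance from the focal plane up to `maxRadius`. A real pass feeds
// this radius into a per-pixel box or Gaussian blur.
function blurRadius(depthValue, focusDepth, maxRadius) {
  const distance = Math.abs(depthValue - focusDepth) / 255; // normalize to 0..1
  return distance * maxRadius;
}
```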

Fourth, composition analysis. A depth histogram of a photograph tells you whether the image has clear foreground/midground/background separation or whether it's flat. That is one of the cheapest objective signals for whether a hero photo is carrying its weight: images whose depth values collapse into a single mode tend to read as flat on the page. Run a batch of candidate hero images through depth estimation and pick the ones with three distinct depth modes.
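One way to sketch that histogram check; the bin count and noise-floor threshold here are illustrative, not tuned values:

```javascript
// Sketch: a crude "does this photo have depth separation?" signal. Bucket
// the depth map into a coarse histogram, then count local peaks above a
// noise floor. Three or more peaks suggests distinct foreground/midground/
// background planes; a single peak suggests a flat image.
function depthModes(depth, bins = 16, minShare = 0.05) {
  const hist = new Array(bins).fill(0);
  for (const d of depth) hist[Math.min(bins - 1, Math.floor(d / (256 / bins)))]++;
  const floor = depth.length * minShare;
  let modes = 0;
  for (let i = 0; i < bins; i++) {
    const left = i === 0 ? 0 : hist[i - 1];
    const right = i === bins - 1 ? 0 : hist[i + 1];
    if (hist[i] > floor && hist[i] >= left && hist[i] >= right) modes++;
  }
  return modes;
}
```

Sorting a folder of candidate images by this count is a one-afternoon script, not a research project.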

Fifth, AR and 3D concepting. Even without a real 3D model, a depth map plus the original image is enough to build a 2.5D scene that can be rotated a few degrees in any direction. Tools like Looking Glass Studio and Three.js scenes both accept depth maps as input. For early-stage design exploration of an immersive concept, that is dramatically faster than building real 3D geometry.
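The 2.5D construction reduces to mapping each pixel to a mesh vertex. A hedged sketch of that mapping, with coordinate conventions and the `zScale` relief factor as assumptions on my part:

```javascript
// Sketch: turn an image + depth map into vertex positions for a 2.5D
// "displaced plane" mesh, the kind of geometry you would hand to Three.js.
// x/y come from the pixel grid mapped to [-1, 1]; z pushes brighter (nearer)
// pixels toward the camera. `zScale` controls how much relief the scene
// gets; a few degrees of rotation is all this illusion survives.
function depthToVertices(depth, width, height, zScale = 0.25) {
  const verts = [];
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      verts.push(
        (x / (width - 1)) * 2 - 1,            // x in [-1, 1]
        1 - (y / (height - 1)) * 2,           // y in [-1, 1], flipped for screen coords
        (depth[y * width + x] / 255) * zScale // z: brighter = closer
      );
    }
  }
  return verts;
}
```

The original photograph then becomes the texture on that mesh, and small camera rotations sell the depth.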

The licensing wrinkle worth knowing

A short note on the model family. Depth Anything V2 ships in four sizes: Small, Base, Large, and Giant. The Small variant — the one in the UDT tool — is Apache 2.0, fully commercial-safe. The Base, Large, and Giant variants are licensed CC BY-NC 4.0, which is non-commercial and rules them out for any production tool you intend to monetize, including ad-supported sites. The model card naming makes this easy to miss; if you're integrating Depth Anything elsewhere, pin the Small variant explicitly and verify the license tag on its specific model page, not on the general Depth Anything documentation.

Why now

Two years ago a depth model meant a Python toolchain, a CUDA GPU, and 4-8 GB of model weights. Today it means a 25 MB browser download and 1-3 seconds per image on a mid-range laptop. The work the Hugging Face transformers.js team is doing to port models to WebGPU-friendly ONNX is the reason — Depth Anything V2 Small was packaged for the browser within weeks of the original paper. For the first time, the gap between "research model published this week" and "free designer tool on the open web" is measured in weeks rather than years.

UDT
UDT News Desk
The UDT News Desk covers what's moving in design, frontend, and the tools designers and developers use. Edited and curated by the team at Ultimate Design Tools.