AI Depth Estimator
Generate a grayscale depth map from any image. Useful for subject masking, parallax composites, displacement maps, and layered designs. Depth Anything V2 Small runs entirely in your browser at 27 MB.
Why an AI Tool That Runs In Your Browser
Depth maps are one of those quietly useful outputs that designers reach for more often than they expect: once you have one you can mask the foreground, blur the background by depth, build a parallax composite, drive a displacement filter, or feed it into a 3D layered scene. The catch is that monocular depth estimation has historically required either an expensive proprietary service or a half-gigabyte model. Depth Anything V2 Small breaks that pattern. It is the smallest model in the Depth Anything V2 family, about 27 MB at int8 quantization, and produces relative depth maps that are good enough for most design uses.

Licensing matters here: only the Small variant of Depth Anything V2 is released under Apache 2.0. The Base, Large, and Giant variants are CC-BY-NC-4.0 (non-commercial). Because UDT runs ads, the tool counts as commercial use, so we ship Small only. Depth Anything V2 Small was developed by Lihe Yang and collaborators and is hosted on Hugging Face by the depth-anything organization; the ONNX weights are mirrored by onnx-community.
How AI Depth Estimator Works
Click Load model on first visit. The browser downloads the quantized ONNX weights (about 27 MB) from the Hugging Face CDN and caches them in IndexedDB. Drop or pick an image from your device. The model runs through the transformers.js depth-estimation pipeline, which returns a normalized depth tensor at the model's native resolution, then resizes it to match the input dimensions. The depth map is rendered as a grayscale PNG where white is near and black is far, the standard convention for depth maps in compositing and 3D software. A side-by-side preview shows the original image and the depth map together, and a download button saves the depth map as a PNG; the original is untouched.

The map can be dropped directly into a layered composite as a mask, fed into a displacement filter in your image editor, or used as a guide for parallax work. Note that Depth Anything V2 produces relative depth, not metric depth: the values represent ordering (near versus far) rather than absolute distances. For nearly all design uses, relative depth is what you want.
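For developers curious about the wiring, the sketch below shows that flow with transformers.js. It is a minimal sketch, not the tool's actual source: the model id follows the onnx-community mirror named above, and the dtype option reflects the transformers.js v3 API, both of which may differ from what the tool ships.

    import { pipeline } from '@huggingface/transformers';

    // First call downloads ~27 MB of quantized weights; later calls hit the cache.
    const estimator = await pipeline(
      'depth-estimation',
      'onnx-community/depth-anything-v2-small', // assumed mirror id
      { dtype: 'q8' }                           // int8 quantization
    );

    // 'photo.jpg' stands in for any image URL or object URL. The pipeline
    // returns the raw tensor plus a normalized single-channel image
    // (0-255, white = near, black = far) resized to the input dimensions.
    const { depth } = await estimator('photo.jpg');

    // Draw the grayscale map to a canvas for preview or PNG export.
    const canvas = document.createElement('canvas');
    canvas.width = depth.width;
    canvas.height = depth.height;
    const ctx = canvas.getContext('2d');
    const img = ctx.createImageData(depth.width, depth.height);
    for (let i = 0; i < depth.width * depth.height; i++) {
      const v = depth.data[i];
      img.data.set([v, v, v, 255], i * 4); // replicate gray into RGB, opaque alpha
    }
    ctx.putImageData(img, 0, 0);
    document.body.append(canvas); // canvas.toDataURL('image/png') gives the download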
Frequently Asked Questions
What is the download size for Depth Anything V2 Small?
Depth Anything V2 Small is approximately 27 MB at int8 quantization, the smallest model in the AI Suite. The browser caches it in IndexedDB after the first download, so later visits load almost instantly.
Will the input image be uploaded to a server?
No. After the model finishes downloading on first use, every depth estimation runs entirely in your browser. The image stays on your device and is never sent to our servers or to any third-party API.
What is depth estimation useful for in design work?
The most common uses are subject masking (separating foreground from background by depth threshold), depth-of-field simulation (blurring background pixels), parallax composites (layered movement between near and far elements), displacement maps for 3D-style effects, and feeding the depth into ControlNet for diffusion-guided edits.
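To make the first of those concrete, here is a minimal sketch of depth-threshold masking in plain canvas code. The function name and the assumption that the photo and its depth map are same-sized canvases are ours for illustration, not part of the tool:

    // Hypothetical helper: cut out the foreground of photoCanvas using a
    // same-sized grayscale depth map (white = near) in depthCanvas.
    function maskForeground(photoCanvas, depthCanvas, threshold = 128) {
      const w = photoCanvas.width, h = photoCanvas.height;
      const out = document.createElement('canvas');
      out.width = w;
      out.height = h;
      const ctx = out.getContext('2d');
      ctx.drawImage(photoCanvas, 0, 0);
      const photo = ctx.getImageData(0, 0, w, h);
      const depth = depthCanvas.getContext('2d').getImageData(0, 0, w, h);
      for (let i = 0; i < w * h; i++) {
        // Pixels darker (farther) than the threshold become transparent.
        if (depth.data[i * 4] < threshold) photo.data[i * 4 + 3] = 0;
      }
      ctx.putImageData(photo, 0, 0);
      return out; // opaque subject on a transparent background
    }

Raising the threshold keeps only the nearest elements; lowering it keeps more of the scene.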
Which depth model and license does this tool use?
The tool uses Depth Anything V2 Small from the depth-anything organization on Hugging Face. The Small variant is released under the Apache 2.0 license, which permits commercial use. Note that the Base, Large, and Giant variants of Depth Anything V2 are CC-BY-NC-4.0 (non-commercial) and are not used by this tool.
Is the depth map metric or relative?
Relative. Depth Anything V2 Small produces relative depth values — pixel values represent ordering from nearest to farthest, not absolute distances in meters. For nearly all design and compositing uses this is what you want. For absolute distance measurement (robotics, autonomous driving) a metric depth model is required.
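Concretely, "relative" means each map is min-max normalized per image, so a value of 255 means nearest in this image rather than any fixed distance. A sketch of that normalization, assuming the raw per-pixel scores arrive as a Float32Array (as the pipeline's raw tensor does in transformers.js):

    // Min-max normalize a raw relative-depth array to 0-255 grayscale.
    function toGrayscale(raw) {
      let min = Infinity, max = -Infinity;
      for (const v of raw) {
        if (v < min) min = v;
        if (v > max) max = v;
      }
      const range = max - min || 1; // guard against a constant map
      // Only the ordering of values survives this rescaling, which is
      // exactly why the output is relative rather than metric.
      return Uint8ClampedArray.from(raw, v => ((v - min) / range) * 255);
    }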
Why is the output grayscale?
The standard depth map convention is grayscale: white means near, black means far. This format drops directly into image editors as a mask, into 3D software as a displacement texture, and into ControlNet without any color-channel manipulation. A colorized visualization preview is also rendered alongside the grayscale output for at-a-glance review.
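The colorized preview can be derived from the same grayscale data. A minimal sketch using a simple warm-to-cool ramp; the tool's actual colormap may differ:

    // Map 0-255 gray values (white = near) to an RGBA ramp: near = warm, far = cool.
    function colorizeDepth(gray) {
      const rgba = new Uint8ClampedArray(gray.length * 4);
      for (let i = 0; i < gray.length; i++) {
        const t = gray[i] / 255;                     // 1 = near, 0 = far
        rgba[i * 4 + 0] = Math.round(255 * t);       // red rises toward near
        rgba[i * 4 + 1] = Math.round(128 * t);       // green adds warmth up close
        rgba[i * 4 + 2] = Math.round(255 * (1 - t)); // blue rises toward far
        rgba[i * 4 + 3] = 255;                       // fully opaque
      }
      return rgba; // wrap in ImageData for a canvas preview
    }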
What image sizes work well?
The model resizes inputs to a standard working size internally and then upsamples the depth map to match the original image dimensions. Any common input size works. Very wide or very tall images sometimes produce slightly blurrier depth at the edges, which is a known limitation of monocular depth estimation.
Why does the depth look wrong on flat or textureless areas?
Monocular depth estimation relies on visual cues (perspective, occlusion, focus, lighting). Flat textureless surfaces — clean walls, blank sky, plain backgrounds — have few cues, so the model can produce confident-looking but unreliable depth there. The output is most accurate on images with clear depth cues like overlapping objects, perspective lines, or visible texture.
Built by Derek Giordano · Part of Ultimate Design Tools