Depth Estimation: How Machines Recover 3D from a Single Photo
What depth maps are, the visual cues behind monocular depth estimation, and how models like Depth Anything turn a flat image into a 3D scene.
A photograph is a flat, two-dimensional grid of pixels. Yet, when you look at a photo of a mountain range or a city street, your brain instantly understands which objects are close enough to touch and which are miles away. This process of recovering the third dimension from a 2D image is called depth estimation.
A depth map is a specialized image where each pixel represents the distance from the camera to the object at that point. Unlike a standard photo that stores color (Red, Green, Blue), a depth map typically stores a single value per pixel, often visualized as a grayscale image.
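A minimal sketch of this idea, using a made-up 4×4 depth array: each entry is a distance in meters (one channel, unlike RGB's three), and the grayscale rendering here maps near to bright and far to dark, one common convention.

```python
import numpy as np

# A hypothetical 4x4 depth map: each value is the distance in meters
# from the camera to the object at that pixel (single channel).
depth_m = np.array([
    [1.2, 1.3, 8.0, 9.5],
    [1.1, 1.4, 8.2, 9.7],
    [1.0, 1.5, 8.5, 9.9],
    [0.9, 1.6, 8.8, 10.0],
])

# Visualize as an 8-bit grayscale image: near = bright, far = dark.
near, far = depth_m.min(), depth_m.max()
gray = ((far - depth_m) / (far - near) * 255).astype(np.uint8)
```

The left half of this toy map (a nearby object) renders bright, the right half (a distant one) renders dark.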
Humans with two eyes use stereopsis to triangulate distance. But how does a single camera perceive depth? This is known as monocular depth estimation, and it relies on several visual cues:

- Occlusion: an object that blocks another must be in front of it.
- Relative size: familiar objects that appear smaller are usually farther away.
- Perspective: parallel lines converge toward the horizon.
- Texture and haze: distant surfaces look smoother, softer, and lower in contrast.
Modern AI models, like Depth Anything, are trained on millions of images where the “ground truth” depth is known from LiDAR or stereo setups.
The model converts a color image into a depth map through a pipeline of roughly three stages:

- Preprocessing: the image is resized to the network's input resolution and its pixel values are normalized.
- Prediction: an encoder extracts visual features, and a decoder maps them to one depth value per pixel.
- Postprocessing: the predicted map is upsampled back to the original resolution.
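The pipeline can be sketched in a few lines. Everything here is illustrative: `model` is a hypothetical callable standing in for a network such as Depth Anything, and `resize` is a crude nearest-neighbor stand-in for the bilinear interpolation real pipelines use.

```python
import numpy as np

def resize(a, shape):
    """Nearest-neighbor resize (a stand-in for bilinear interpolation)."""
    rows = np.arange(shape[0]) * a.shape[0] // shape[0]
    cols = np.arange(shape[1]) * a.shape[1] // shape[1]
    return a[rows][:, cols]

def estimate_depth(image, model, size=64):
    """Run a monocular depth model over an RGB image.

    `model` is a hypothetical callable: it maps a (size, size, 3)
    float image to a (size, size) array of per-pixel depths.
    """
    h, w = image.shape[:2]
    x = image.astype(np.float32) / 255.0   # normalize pixels to [0, 1]
    x = resize(x, (size, size))            # fit the model's input resolution
    depth = model(x)                       # per-pixel depth prediction
    return resize(depth, (h, w))           # upsample back to original size
```

Whatever the input resolution, the caller gets back a depth map aligned pixel-for-pixel with the original photo.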
The magic of modern AI is its ability to understand context. It knows that a person standing on a sidewalk is likely closer than the building behind them.
Once you have a depth map, you can manipulate a 2D photo as if it were a 3D scene:

- Synthetic depth of field ("portrait mode"): blur pixels beyond a chosen distance.
- Parallax effects: shift near and far regions by different amounts to fake camera motion.
- Background replacement and relighting: segment the scene by distance instead of by color.
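As one example of such manipulation, here is a toy "portrait mode": pixels whose depth exceeds a threshold get a cheap shift-and-average blur, while nearer pixels stay sharp. The threshold split and five-tap blur are simplifications; real pipelines use smooth, depth-weighted blur kernels.

```python
import numpy as np

def fake_bokeh(image, depth, threshold):
    """Blur everything farther than `threshold` (toy portrait mode).

    image: (H, W, 3) uint8 photo; depth: (H, W) distances per pixel.
    """
    img = image.astype(np.float32)
    # Cheap blur: average the image with diagonally shifted copies of itself.
    blurred = sum(np.roll(img, s, axis=(0, 1))
                  for s in [(-1, -1), (-1, 1), (0, 0), (1, -1), (1, 1)]) / 5
    mask = (depth > threshold)[..., None]   # background mask, broadcast to RGB
    return np.where(mask, blurred, img).astype(np.uint8)
```

Because the mask comes from depth rather than color, the effect survives busy backgrounds that would defeat color-based segmentation.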
Depth estimation is the bridge between the 2D world of images and the 3D world we inhabit.