How OCR Works: From Pixels to Text
Why machine reading started with magnetic ink, how modern pipelines turn pixels into characters, and why input quality beats engine choice.
You snap a photo of a receipt, and seconds later the text is on your screen, editable, searchable, and copy-pasteable. Behind that simple interaction lie decades of research in Optical Character Recognition, the technology that teaches computers to read. How does a machine look at pixels and see letters?
OCR began in the 1950s when postal services needed to sort mail automatically. Early systems could only read specially designed fonts printed in magnetic ink (the blocky numbers on the bottom of checks are a relic of this era). By the 1990s, scanners and desktop OCR software made document digitisation practical. Today, OCR runs in real time on phone cameras, reading signs, menus, and license plates.
Modern OCR pipelines break the problem into four stages:

Image → Preprocess → Detect text regions → Segment characters → Recognise → Output text

1. **Preprocess**: convert to grayscale, deskew, denoise
2. **Detect text regions**: draw bounding boxes around lines of text
3. **Segment characters**: split into individual characters, or pass whole lines to a sequence model
4. **Recognise**: a neural net plus a language model produce the output text

Tesseract is the most widely used open-source OCR engine. Originally developed by Hewlett-Packard in the 1980s, it was released as open source in 2005 and is now maintained by Google. Tesseract 5 uses an LSTM (Long Short-Term Memory) neural network for recognition, which dramatically improved accuracy over the older pattern-matching approach.
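The preprocessing stage is easy to sketch without any image library. The following toy represents an image as nested lists of pixels; the function names and the 2×2 sample are invented for illustration, and it shows only grayscale conversion plus a simple global threshold:

```python
# Sketch of the first preprocessing steps, assuming an RGB image is a
# list of rows of (r, g, b) tuples (a stand-in for real image data).

def to_grayscale(image):
    """Luminance conversion using the Rec. 601 weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]

def binarize(gray, threshold=128):
    """Global threshold: 1 = ink (dark), 0 = background (light)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# A 2x2 "image": dark text pixels on light paper.
img = [[(10, 10, 10), (240, 240, 240)],
       [(250, 250, 250), (20, 20, 20)]]
print(binarize(to_grayscale(img)))  # [[1, 0], [0, 1]]
```

Real pipelines add deskewing and denoising before detection, but the idea is the same: everything downstream works on a clean binary image.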
Tesseract supports over 100 languages and scripts, including Chinese, Arabic, and Devanagari. It can run in the browser via WebAssembly (through libraries like Tesseract.js), which means OCR can happen entirely on the client side without uploading images to a server.
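The older pattern-matching approach that LSTM recognition replaced can be illustrated with a toy template matcher: compare each unknown glyph to stored bitmaps and pick the closest. The 3×3 bitmaps and the two-letter alphabet here are invented for illustration:

```python
# Toy pre-neural-net recognition: match each unknown glyph against
# stored templates by Hamming distance (number of differing pixels).

TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}

def hamming(a, b):
    """Count pixels where the two bitmaps disagree."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognise(glyph):
    """Return the template letter whose bitmap differs in the fewest pixels."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))

noisy_I = [[0, 1, 0],
           [1, 1, 0],   # one stray pixel of noise
           [0, 1, 0]]
print(recognise(noisy_I))  # I
```

This is why early systems needed specially designed fonts: rigid templates tolerate a stray pixel or two, but not the variation of arbitrary typefaces, which is exactly what learned models handle better.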
| Challenge | Why it's hard | Mitigation |
|---|---|---|
| Handwriting | Infinite variation between writers | Specialised handwriting models (HTR) |
| Curved text | Characters distort along arcs | Text rectification preprocessing |
| Low contrast | Light text on light backgrounds | Adaptive thresholding, histogram equalisation |
| Non-Latin scripts | More glyphs, connected characters | Language-specific models |
| Complex layouts | Tables, columns, mixed content | Layout analysis before recognition |
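Adaptive thresholding, listed above as a mitigation for low contrast, compares each pixel to the mean of its neighbourhood instead of to one global cutoff. A minimal pure-Python sketch, where the window size and the bias constant C are illustrative choices:

```python
# Adaptive thresholding: a pixel counts as ink if it is darker than the
# mean of its local window by at least C.

def adaptive_threshold(gray, window=3, C=2):
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Neighbourhood mean, with the window clipped at the image border.
            patch = [gray[j][i]
                     for j in range(max(0, y - r), min(h, y + r + 1))
                     for i in range(max(0, x - r), min(w, x + r + 1))]
            local_mean = sum(patch) / len(patch)
            row.append(1 if gray[y][x] < local_mean - C else 0)
        out.append(row)
    return out

# Light-grey text (180) on a slightly lighter background (200).
gray = [[200, 200, 200],
        [200, 180, 200],
        [200, 200, 200]]
print(adaptive_threshold(gray))  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

A global threshold at 128 would call every pixel in this example background; the local comparison still recovers the text pixel.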
OCR accuracy depends far more on the input quality than on the engine: a sharp, well-lit, high-resolution scan with straight lines of text will beat a better recogniser working from a blurry, skewed photo.
OCR doesn't just read text — it bridges the physical and digital worlds. Every scanned form, photographed whiteboard, and translated sign relies on a machine that learned to see letters in pixels.