How PDF Works: Pages, Fonts, and Security
PDF internal structure, how fonts are embedded, why PDFs look the same everywhere, and how password encryption protects documents.
We've all been there: you receive a PDF, notice a tiny typo, and think, “I'll just quickly fix that.” But as soon as you try to click the text, everything breaks. The fonts change, the layout jumps around, or you find yourself unable to click anything at all. Why is a format so universal also so incredibly stubborn?
The biggest reason PDFs are hard to edit is that they weren't designed to be “documents” in the way Word or Google Docs are. A Word doc is like a bucket of liquid text that flows and refills as you type. A PDF is more like a digital photograph of a printed piece of paper.
When you save a file as a PDF, you are essentially “freezing” it. The goal of a PDF (Portable Document Format) is to look exactly the same on every screen, printer, and device in the world. To achieve that perfect consistency, it gives up the flexibility of easy editing.
In a normal document, your computer knows that a group of letters forms a word, and words form a paragraph. If you delete a word, the rest of the paragraph “reflows” to fill the gap.
Most PDFs don't store paragraphs. Often they don't even store words as such. Instead, a PDF page is a list of drawing instructions that tells the computer exactly where to place each character, or short run of characters, using X and Y coordinates.
If you delete the “H” from “Hello” in a PDF editor, the “e” typically doesn't move over to take its place. It stays exactly at its assigned coordinate. This is why editing a PDF often feels like trying to move furniture in a room where everything is bolted to the floor.
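To make the "bolted to the floor" idea concrete, here is a minimal Python sketch. It is not how any real PDF library stores a page: the `PlacedGlyph` class, the helper names, and the fixed 7-point advance between characters are all assumptions made for illustration. It only shows that removing one positioned character leaves the others where they were.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlacedGlyph:
    """One character pinned to a page position (hypothetical model)."""
    char: str
    x: float  # horizontal position in points
    y: float  # baseline position in points


def place_text(text: str, x: float, y: float, advance: float = 7.0) -> list[PlacedGlyph]:
    """Pin each character to a fixed coordinate.

    A constant advance per character is a simplification; real fonts
    give each glyph its own width.
    """
    return [PlacedGlyph(ch, x + i * advance, y) for i, ch in enumerate(text)]


def delete_glyph(glyphs: list[PlacedGlyph], index: int) -> list[PlacedGlyph]:
    """Remove one glyph. The rest keep their coordinates, so nothing reflows."""
    return [g for i, g in enumerate(glyphs) if i != index]


page = place_text("Hello", x=72.0, y=720.0)
edited = delete_glyph(page, 0)  # remove the "H"

for g in edited:
    print(f"{g.char!r} at ({g.x}, {g.y})")
```

The “e” still sits at x = 79.0, one advance to the right of where the “H” used to be, so the page now shows a gap instead of closing up the word.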
Have you ever opened a PDF and seen weird symbols or boxes where letters should be? This usually happens because of fonts.
To make sure a document looks the same on your phone as it does on a billboard, PDFs usually “embed” the fonts they use. They carry a copy of the font inside the file, often just a subset containing only the characters the document actually uses. That is enough to display the page, but not always enough to edit it: to type new letters, your PDF editor generally needs the complete font, installed on your computer.
If you don't have the font, the editor will try to swap it for a “similar” one, which often ruins the layout or makes the text look slightly “off.”
“Editing a PDF is like trying to paint a new room onto a finished house using only the leftover paint from the original construction.”
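The “leftover paint” problem can be sketched in a few lines of Python. This is a toy model, not a real font API: the embedded font is represented as a plain set of characters, and the function names and the `subset` example are assumptions made for illustration. Any character missing from the set is drawn as a box (□).

```python
def missing_glyphs(embedded_subset: set[str], new_text: str) -> set[str]:
    """Return the characters in new_text that the embedded subset cannot draw."""
    return {ch for ch in new_text if not ch.isspace() and ch not in embedded_subset}


def render_with_boxes(embedded_subset: set[str], new_text: str) -> str:
    """Draw new_text using only the embedded glyphs; missing ones become a box."""
    return "".join(
        ch if ch.isspace() or ch in embedded_subset else "\u25a1"
        for ch in new_text
    )


# Hypothetical subset: only the glyphs the original document actually used.
subset = set("Invoice total")

print(missing_glyphs(subset, "Invoice final"))     # {'f'}
print(render_with_boxes(subset, "Invoice final"))  # Invoice □inal
```

The original text never used an “f”, so the embedded subset has no shape for it. A real editor would either substitute a different font for the new text or fail to draw the character at all.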
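Not all PDFs are created equal. There are two main types: native PDFs, which are exported from software and contain real text, and scanned PDFs, which are just pictures of pages with no text in them at all.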
To edit a scanned PDF, you first need to run it through OCR (Optical Character Recognition) software, which “guesses” what the letters are. If the scan is blurry, the computer might guess wrong, which is why copying text from a scan sometimes results in gibberish.
When you use a PDF editor to change a word, it often isn't actually changing the original text. Instead, it uses the “overlay” technique.
Imagine taking a physical piece of paper, putting a strip of white-out over a word, and then writing a new word on top of the white-out. That is essentially what many PDF editors do. The original text is often still there, hidden underneath a white box!
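Here is a rough Python sketch of the white-out approach at the level of a page's drawing instructions. The operators used (`rg` for fill color, `re` and `f` for a filled rectangle, `BT`/`Tf`/`Td`/`Tj`/`ET` for text, `q`/`Q` to save and restore state) are standard PDF content-stream operators, but everything else is an assumption for illustration: the helper names, the font name `/F1`, the coordinates, and the very simplified regular expression that stands in for a real parser.

```python
import re


def overlay_edit(content: str, x: float, y: float, width: float, height: float,
                 new_text: str, font: str = "/F1", size: float = 12.0) -> str:
    """Append a white box and replacement text; the original operators stay untouched."""
    patch = "\n".join([
        "q",                                   # save graphics state
        "1 1 1 rg",                            # fill color: white
        f"{x} {y - 2} {width} {height} re f",  # paint the white-out box
        "0 0 0 rg",                            # fill color: black for the new text
        f"BT {font} {size} Tf {x} {y} Td ({new_text}) Tj ET",
        "Q",                                   # restore graphics state
    ])
    return content + "\n" + patch


def shown_strings(content: str) -> list[str]:
    """Collect every literal string passed to Tj (ignores escapes and other text operators)."""
    return re.findall(r"\((.*?)\)\s*Tj", content)


original = "BT /F1 12 Tf 72 720 Td (Teh report) Tj ET"
edited = overlay_edit(original, x=72, y=720, width=80, height=14, new_text="The report")

print(shown_strings(edited))  # ['Teh report', 'The report']
```

Both strings are still in the page: the misspelled original is only covered up, which is why text hidden this way can often still be found by search or copy and paste.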
If you've ever tried to edit an academic paper or a math textbook, you know it's nearly impossible. Math symbols (like √, ∑, or π) aren't standard letters. They often use special fonts like Computer Modern or STIX.
These fonts often map symbols to custom character codes that general-purpose editors don't understand. When you try to save an edit, the math symbols often turn into empty squares (□), a phenomenon developers call “tofu.” Because the editor doesn't know how to “draw” that symbol in the new version of the file, it just gives up.
PDFs are a masterpiece of consistency but a nightmare for flexibility. They were built to be the final destination for a document, not a stop along the way. If you need to make major changes, your best bet is always to find the original Word or Google Doc file rather than fighting the frozen instructions of a PDF.