What is Base64 and Why Do We Use It?

Have you ever pasted an image directly into a CSS file? Or looked at a JWT and wondered what that long string of letters and digits was? Both rely on Base64, an encoding scheme that represents binary data as plain text. It's one of the most widely used encodings on the internet, yet many developers use it without knowing how it actually works.


The problem Base64 solves

Computers store everything as binary — sequences of 1s and 0s. A photo, a PDF, an encryption key: they're all just bytes. But many communication channels were designed for text only. Email (SMTP) was built for 7-bit ASCII. JSON can't contain raw binary. HTML attributes expect text strings. If you tried to embed a raw JPEG in an email, the binary bytes would get corrupted by text-processing systems that strip high bits or interpret control characters.

Base64 solves this by translating binary data into a set of 64 “safe” characters that pass through text-based channels unchanged.

The 64-character alphabet

Base64 uses exactly 64 characters to represent data:

A-Z  (26 characters)  → values 0–25
a-z  (26 characters)  → values 26–51
0-9  (10 characters)  → values 52–61
+    (1 character)     → value 62
/    (1 character)     → value 63

=    (padding character, not part of the 64)

Every character in this alphabet is a printable ASCII character, so the output is safe in email, JSON strings, and HTML attributes. Only + and / (and the = padding) need care in URLs, which is why a URL-safe variant exists. That's the entire point: no control bytes, nothing a text system would misinterpret.
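As a quick illustration, the table above can be rebuilt in Python from the standard library's character constants. The names ALPHABET and value_of are ours, chosen for this sketch; they are not part of any library.

```python
import string

# Standard Base64 alphabet in value order: A-Z, a-z, 0-9, then + and /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def value_of(char: str) -> int:
    """Return the 6-bit value (0-63) that a Base64 character stands for."""
    return ALPHABET.index(char)

print(len(ALPHABET))                                # 64
print(ALPHABET[0], ALPHABET[26], ALPHABET[52])      # A a 0
print(ALPHABET[62], ALPHABET[63])                   # + /
print(value_of("S"), value_of("G"), value_of("k"))  # 18 6 36
```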

How encoding works, step by step

Base64 takes every 3 bytes (24 bits) of input and splits them into 4 groups of 6 bits. Each 6-bit group maps to one of the 64 characters. Since 6 bits can represent 0–63, the mapping is exact.

Input text:     "Hi"
ASCII bytes:    72, 105
Binary:         01001000  01101001

Split into 6-bit groups:
  010010  000110  1001xx

Pad remaining bits with zeros:
  010010  000110  100100

Look up each value in the Base64 alphabet:
  18 → S    6 → G    36 → k

Add padding (input was 2 bytes, not 3):
  Result: "SGk="

Note: Base64 is encoding, not encryption. It provides zero security. Anyone can decode a Base64 string instantly. Never use Base64 to “hide” passwords, tokens, or sensitive data. It's a transport format, not a protection mechanism.
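The walk-through above translates fairly directly into code. Here is a minimal sketch of a hand-written encoder, checked against Python's built-in base64 module. The function name encode_base64 is our own, and in real code you would simply call base64.b64encode.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_base64(data: bytes) -> str:
    """Encode bytes by hand: 3 bytes in, 4 characters out."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack the chunk into a 24-bit integer, zero-filling missing bytes.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Slice the 24 bits into four 6-bit values, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 meaningful characters; the rest is "=".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

print(encode_base64(b"Hi"))                      # SGk=
print(base64.b64encode(b"Hi").decode("ascii"))   # SGk=
print(base64.b64decode("SGk="))                  # b'Hi'
```

The last line is also the whole "attack" on Base64 as a secret: decoding takes one function call and no key.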

Why the output is ~33% larger

Three input bytes become four output characters. That's a 4/3 ratio, so Base64 output is about 33% larger than the original binary (slightly more for short inputs, because padding rounds the output up to a multiple of 4 characters). A 1 MB image becomes roughly 1.33 MB when Base64-encoded. This is the cost of text-safety: you trade size for compatibility.
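To check the arithmetic, here is a small sketch that predicts the padded output length and compares it with what Python's base64 module actually produces. The helper name encoded_length is ours.

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output: 4 characters per started 3-byte group."""
    return 4 * ((n_bytes + 2) // 3)

for size in (1, 2, 3, 1000, 1_000_000):
    actual = len(base64.b64encode(bytes(size)))
    assert actual == encoded_length(size)
    print(f"{size} bytes -> {actual} characters ({actual / size:.3f}x)")
```

For a 1,000,000-byte input this gives 1,333,336 characters, which is where the "about 33%" figure comes from. Tiny inputs fare worse: a single byte still costs 4 characters.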

The padding problem

When the input length isn't divisible by 3, Base64 adds = padding characters to make the output length a multiple of 4. One leftover byte produces two characters followed by ==; two leftover bytes produce three characters followed by =. Some variants and implementations (Base64url as used in JWTs, for example) drop the padding entirely, since the decoder can infer it from the length of the encoded text.
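A short sketch makes both halves of that concrete: how padding appears for 1, 2, and 3 input bytes, and how a decoder can put it back when it has been stripped. The helper decode_unpadded is a name we made up for this example.

```python
import base64

for data in (b"a", b"ab", b"abc"):
    print(len(data), base64.b64encode(data).decode("ascii"))
# 1 YQ==
# 2 YWI=
# 3 YWJj

def decode_unpadded(text: str) -> bytes:
    """Restore stripped "=" padding, then decode standard Base64."""
    # -len(text) % 4 is the number of characters missing from the last group.
    return base64.b64decode(text + "=" * (-len(text) % 4))

print(decode_unpadded("YQ"))    # b'a'
print(decode_unpadded("YWI"))   # b'ab'
print(decode_unpadded("YWJj"))  # b'abc'
```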


Where Base64 is used

Data URIs in HTML and CSS

Instead of linking to an external image file, you can embed it directly in your markup:

<img src="data:image/png;base64,iVBORw0KGgo..." />

/* Or in CSS */
background-image: url(data:image/svg+xml;base64,PHN2Zy...);

This eliminates an HTTP request but increases the HTML/CSS file size. It's worth it for tiny icons and SVGs but counterproductive for large images.
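If you want to generate such a URI yourself, something like the following sketch would do it. The function to_data_uri and the tiny SVG are placeholders for illustration; in practice you would read the bytes from a file.

```python
import base64

def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a data: URI that embeds the bytes as Base64."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"

# Placeholder image: a minimal, empty SVG document.
svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
uri = to_data_uri(svg, "image/svg+xml")

print(uri)
print(len(svg), "bytes ->", len(uri), "characters")
```

The printed lengths show the trade-off directly: the URI is always longer than the file it replaces.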

Email attachments (MIME)

When you attach a file to an email, your mail client encodes it as Base64 and embeds it in the message body using MIME (Multipurpose Internet Mail Extensions). The receiving client decodes it back to the original file. This is why an email with attachments is noticeably larger than the files themselves.
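You can watch this happen with Python's standard email package. The sketch below attaches a few bytes (the 8-byte PNG file signature, standing in for a real file) and inspects the resulting MIME part. The subject, body text, and filename are made up for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"           # placeholder subject
msg.set_content("See attached.")    # placeholder body text

data = b"\x89PNG\r\n\x1a\n"         # PNG signature, standing in for a file
msg.add_attachment(
    data,
    maintype="application",
    subtype="octet-stream",
    filename="blob.bin",            # placeholder filename
)

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])  # base64
print(attachment.get_payload().strip())         # iVBORw0KGgo=
print(attachment.get_content() == data)         # True
```

The encoded text is what actually travels in the message; get_content() plays the role of the receiving client and decodes it back to the original bytes.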

JSON Web Tokens (JWTs)

JWTs consist of three Base64url-encoded segments separated by dots: a header, a payload, and a signature. The header and payload are just JSON objects, encoded in Base64url so they can travel safely in HTTP headers and URLs.

eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiYWxpY2UifQ.signature
│                    │                      │
│                    │                      └─ signature
│                    └─ payload (Base64url)
└─ header (Base64url)

Base64 vs Base64url

Standard Base64 uses + and /, which are special characters in URLs. Base64url swaps them for - and _ to make the output URL-safe without percent-encoding. JWTs and many modern APIs use Base64url by default.

  • Standard Base64: uses +, /, and = padding
  • Base64url: uses -, _, and often omits padding

Base64 is the duct tape of the internet: it was never meant to be elegant, but it holds everything together when binary data needs to travel through text-only channels.
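To see the two alphabets side by side, here is a small sketch using input bytes chosen so that the output contains both of the characters that differ. Note that Python's urlsafe_b64encode keeps the = padding, so stripping it is a separate step.

```python
import base64

data = b"\xfb\xff"  # chosen so the 6-bit groups hit values 62 and 63

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)              # +/8=
print(url_safe)              # -_8=
print(url_safe.rstrip("="))  # -_8
```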

Try it yourself

Put what you've learned into practice with our Base64 Encode/Decode tool.