Coding a base64 Encoder in JavaScript

Base64 encoding is something you’ll come across almost every day as a developer; it’s everywhere. The tools for it are baked into our favourite programming languages, so we rarely have to consider what’s going on under the hood. In this post, I code my own base64 encoder in the hope of understanding it a little better and, hopefully, teaching you, the reader, something along the way.

I’m currently in the process of coding an Electron app using Vue3 which contains a collection of dev utilities — check out my previous blog for more details.

What is base64?

Base64 encoding is a data encoding mechanism that represents binary data in an ASCII string format by translating it into a radix-64 representation. Every 3 bytes of input become 4 characters of output, so the encoding comes with an overhead of roughly 33%, rising to about 37% when line breaks are inserted (as in MIME, which wraps lines at 76 characters).
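To make the overhead figure concrete, here is a small sketch that works out the padded output length from the input size. The name `base64Length` is just an illustrative helper I made up, not something from a library.

```javascript
// Hypothetical helper: length of the padded base64 output for a given
// number of input bytes.
function base64Length(byteCount) {
  // Every group of up to 3 bytes produces exactly 4 output characters.
  return Math.ceil(byteCount / 3) * 4;
}

console.log(base64Length(3)); // 4
console.log(base64Length(4)); // 8
console.log(base64Length(300)); // 400
```

300 bytes in, 400 characters out: that is where the 33% comes from.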

The quickest way to understand base64 encoding is to run through an example, so let’s convert the string “abc”.

  1. Convert each character in the string to its decimal representation: a=97, b=98, c=99
  2. Convert each decimal to binary:  [01100001, 01100010, 01100011]
  3. Concatenate the binary: 011000010110001001100011
  4. Split the binary up into groups of 6 bits: [011000, 010110, 001001, 100011]
  5. Convert each group to decimal: [24, 22, 9, 35]
  6. Use the lookup table to find the base64 character for each value: [Y, W, J, j]
    1. The lookup table here is just the character list: A–Z, a–z, 0–9, + and /. The = character is not in the table; it is only used for padding.

So the string “abc” produces the base64 value “YWJj”. Notice the 33% increase in size: 3 characters in, 4 characters out.

Coding a base64 encoder in JavaScript

In Node.js, if you want to encode a string to base64, you’d write something similar to the following:

Buffer.from("abc").toString("base64"); // YWJj

But that is, of course, far too simple, so let’s create our own.

Following the steps above:

Convert each character in the string to its decimal representation

  let decimalValues = "abc"
    .split("")
    .map((char) => char.charCodeAt(0).toString(10)); // ["97", "98", "99"]

So here I’m splitting the string to convert it to an array and then mapping each character to its decimal representation.
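One caveat worth noting: `charCodeAt` returns UTF-16 code units, so this step only lines up with real bytes for plain ASCII input. As a rough sketch of one way around that (my own assumption, not something the steps above rely on), the built-in `TextEncoder` can turn a string into its UTF-8 bytes first:

```javascript
// TextEncoder is available as a global in modern browsers and Node.js.
// "\u20AC" is the euro sign, which does not fit in a single byte.
const euro = "\u20AC";

console.log(euro.charCodeAt(0)); // 8364, too big for an 8-bit value

// Encode to UTF-8 so every value is a real byte in the range 0 to 255.
const bytes = Array.from(new TextEncoder().encode(euro));
console.log(bytes); // [226, 130, 172]
```

For the rest of this post I stick with ASCII strings, where both approaches give the same numbers.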

Convert each decimal to binary

  let binaryValues = decimalValues.map((decimal) => {
    let binary = (decimal >>> 0).toString(2);
    return "00000000".substring(binary.length) + binary;
  });  // ["01100001", "01100010", "01100011"]

Here I’m converting each value to binary and then padding it back out to 8 bits, because toString(2) drops any leading zeroes.
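If you prefer, the same left-padding with zeroes can be sketched with `String.prototype.padStart`, which reads a little more directly. The variable names here are only for illustration:

```javascript
// Decimal values for "a", "b" and "c".
const decimals = [97, 98, 99];

// toString(2) drops leading zeroes, so pad each value back out to 8 bits.
const padded = decimals.map((n) => n.toString(2).padStart(8, "0"));

console.log(padded); // ["01100001", "01100010", "01100011"]
```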

Concatenate the binary and split it into sections of 6

binaryValues = binaryValues.join("").match(/.{1,6}/g); // ["011000", "010110", "001001", "100011"]

This pulls the binary array into a string and then splits it into a new array with a maximum of 6 bits per item.

Convert binary to decimal

let indexes = binaryValues.map(b => parseInt(b, 2)); // [24, 22, 9, 35]

Use the lookup table to find its base64 representation

let encodingTab =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

...

let out = indexes.map(e => encodingTab.charAt(e)).join(""); // YWJj

So given the index we retrieved from converting from the binary value to decimal, we can grab the base64 letter and plug it into our output.

Base64 padding

One quick hint that a string might be base64 is that it ends in = or ==. The example above doesn’t end with any padding, so there aren’t any equals signs: “abc” converts to 24 bits, which is divisible by 6, so it works perfectly without any modifications. But if we add just one extra letter, it’s a completely different story.

“abcd” converts to 32 bits, which isn’t divisible by 6. If we split 32 bits into groups of 6, we get five full groups and 2 bits left over: [000000, 000000, 000000, 000000, 000000, 00]

To combat this, we add zeroes to the end of the last group until it is 6 bits long: if 2 bits are missing we add 2 zeroes, and if 4 bits are missing we add 4 zeroes. The result looks like this: [000000, 000000, 000000, 000000, 000000, 000000]. If we added 2 bits, we append one “=” to the end of the output; if we added 4 bits, we append two.
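The rule can be written down as a small calculation. This is only a sketch; `paddingFor` is a made-up helper name, and it assumes the bit length is a multiple of 8 (whole bytes), so the leftover is always 0, 2 or 4 bits.

```javascript
// Hypothetical helper: padding needed for a bit string of the given length.
// Assumes bitLength is a multiple of 8.
function paddingFor(bitLength) {
  const leftover = bitLength % 6; // bits in the short final group
  const zeros = (6 - leftover) % 6; // zero bits needed to complete it
  const equals = "=".repeat(zeros / 2); // one "=" per 2 zero bits added
  return { zeros, equals };
}

console.log(paddingFor(24).zeros, paddingFor(24).equals); // 0 and ""
console.log(paddingFor(32).zeros, paddingFor(32).equals); // 4 and "=="
console.log(paddingFor(16).zeros, paddingFor(16).equals); // 2 and "="
```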

Let’s run through the example using “abcd”:

  1. Convert each character in the string to its decimal representation: a=97, b=98, c=99, d=100
  2. Convert each decimal to binary:   [01100001, 01100010, 01100011, 01100100]
  3. Concatenate the binary: 01100001011000100110001101100100
  4. Split the binary up into groups of 6 bits: [011000, 010110, 001001, 100011, 011001, 00]
  5. Add 0s to the end of the last element in the array: [011000, 010110, 001001, 100011, 011001, 000000]
  6. Convert each group to decimal: [24, 22, 9, 35, 25, 0]
  7. Use the lookup table to find the base64 character for each value: [Y, W, J, j, Z, A]
  8. Append the extra equals signs: YWJjZA==

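Here’s a simple way to add the padding in JavaScript. It runs after the binary has been split into 6-bit groups and before the groups are converted to decimal, and the padding string is appended to the final output:

  let padding = "";
  const lastElement = binaryValues[binaryValues.length - 1];

  if (lastElement.length !== 6) {
    if (lastElement.length === 2) {
      // 2 bits present, 4 missing: add four 0s and two "="
      padding += "==";
      binaryValues[binaryValues.length - 1] += "0000";
    } else {
      // 4 bits present, 2 missing: add two 0s and one "="
      padding += "=";
      binaryValues[binaryValues.length - 1] += "00";
    }
  }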
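Putting the steps together, a minimal sketch of the whole encoder might look something like this. The function name `encodeBase64` is my own, it assumes plain ASCII input, and it uses `padStart`, `padEnd` and `repeat` in place of the manual padding shown earlier.

```javascript
const encodingTab =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Sketch of the full pipeline. Assumes the input is plain ASCII.
function encodeBase64(input) {
  if (input === "") return "";

  // Steps 1 to 3: characters -> 8-bit binary strings -> one long bit string.
  const bits = input
    .split("")
    .map((char) => char.charCodeAt(0).toString(2).padStart(8, "0"))
    .join("");

  // Step 4: split into groups of up to 6 bits.
  const groups = bits.match(/.{1,6}/g);

  // Padding: complete the last group with 0s, one "=" per 2 bits added.
  const last = groups[groups.length - 1];
  const padding = "=".repeat((6 - last.length) / 2);
  groups[groups.length - 1] = last.padEnd(6, "0");

  // Remaining steps: binary -> index -> base64 character, then append padding.
  return groups.map((g) => encodingTab.charAt(parseInt(g, 2))).join("") + padding;
}

console.log(encodeBase64("abc")); // YWJj
console.log(encodeBase64("abcd")); // YWJjZA==
console.log(encodeBase64("ab")); // YWI=
```

For ASCII strings the output should match what Buffer.from(...).toString("base64") gives you in Node.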

And that’s pretty much all there is to it. If you liked this post, please sign up for my newsletter and join an awesome community!
