Base64 padding... I'm confused.

Hey geeks,
I was messing around with Base64 and could understand the basics (ASCII to binary),
but padding confuses me. I read somewhere that we should add = or == to make the character sequence a multiple of 4. What if I have only one character in the last 6-bit sequence, do I add ===? I am 100% sure that === does not exist in Base64, so when do we use padding, and how?
I'll appreciate any help or notes.
Wassim
You're right that there will never be a case where you will add three padding characters to your encoded text.

The encoding process is like this:

1. Take a string and break it up into sets of 3 bytes each.
2. For every set of three bytes, break it up into 4 numbers of 6 bits each.
3. The numbers are indexes into the base 64 lookup table, giving you your base 64 character.

The last set will either contain 3 bytes (no padding needed), or 2 bytes (we pad with one '='),
or 1 byte (we pad with 2 '=').

If it has 2 bytes, we add one 0 byte to the end. This lets us extract three numbers out of it: 6 bits + 6 bits + 6 bits (the third number is the final 4 bits of our data + 2 bits from the 0 byte). We use the three numbers as indexes in the lookup table. Since Base64 takes three bytes and outputs 4 characters, we add a single '=' character to our output.

If we have 1 byte in the final set, we add two 0 bytes to the end. We can extract two numbers: 6 bits + 6 bits (the second number is the final 2 bits of our data + 4 bits from the neighbouring 0 byte). We add two '=' characters so that our final set gives us 4 characters.
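
The steps above can be sketched in Python. This is only a minimal illustration, not how any particular library does it: the names ALPHABET and encode_base64 are made up for this example, and the lookup table is assumed to be the standard Base64 alphabet (A-Z, a-z, 0-9, +, /).

```python
# Assumed lookup table: the standard Base64 alphabet.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Hypothetical helper that follows the 3-bytes-to-4-characters steps."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)             # 0, 1 or 2 bytes short
        padded = chunk + b"\x00" * missing   # add the 0 bytes
        n = int.from_bytes(padded, "big")    # the 24 bits as one integer
        # Four 6-bit numbers, from the most significant bits down.
        indexes = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # Skip the numbers that come entirely from the added 0 bytes...
        keep = 4 - missing
        out.append("".join(ALPHABET[k] for k in indexes[:keep]))
        # ...and put one '=' in place of each skipped number.
        out.append("=" * missing)
    return "".join(out)


print(encode_base64(b"Hello"))
```

Because a set can only be short by 1 or 2 bytes, the sketch can only ever emit one or two '=' characters, never three.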

Let's try an example with the string "Hello".
H e l l o = 0x48 0x65 0x6C 0x6C 0x6F

01001000 01100101 01101100 01101100 01101111
We have 2 sets: one with 3 bytes, and one with 2 bytes. So we add one 0 byte to the end:
{01001000 01100101 01101100} {01101100 01101111 00000000}
Let's take the first set and produce our first 4 numbers:
01001000 01100101 01101100

010010000110010101101100

010010 000110 010101 101100

0x12 0x06 0x15 0x2C

18 6 21 44
Those last four numbers are the decimal values we got; we'll use them to index into the lookup table.
But first, let's encode the second set:
01101100 01101111 00000000

011011000110111100000000

011011 000110 111100 000000

0x1B 0x06 0x3C

27 6 60
Notice how we skipped that last set of 6 bits? That's because the 0 byte isn't really part of the input text; we only added it so that we could extract those three numbers. Since we need a way to indicate to whoever's decoding the text that we did this, we add a single padding character, '='.

Now let's use the base 64 lookup table to generate our encoded text:
{18 6 21 44} {27 6 60} =

{S G V s} {b G 8} =
Putting it all together:
SGVsbG8=
I hope that makes sense!
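
If you want to double-check the worked example, here is a small sketch using Python's standard base64 module. The table string is built by hand and assumes the standard Base64 alphabet; the variable names are just for illustration.

```python
import base64
import string

# Assumed lookup table: A-Z, a-z, 0-9, then '+' and '/'.
table = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# The index values worked out by hand above.
first_set = [18, 6, 21, 44]
second_set = [27, 6, 60]

by_hand = "".join(table[k] for k in first_set + second_set) + "="
by_library = base64.b64encode(b"Hello").decode("ascii")

print(by_hand)
print(by_library)
print(base64.b64decode(by_library))
```

Both strings should match, and decoding gives back the original 5 bytes, without the 0 byte we added.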
My technique is different, I think.
1- Convert the ASCII values to 8-bit binary.
2- Join the binary sequence, then break it into 6-bit groups; you can add zeros to the last group to fill it out to 6 bits.
3- Convert the resulting binary to decimal, then from decimal to Base64 table characters.
But I still haven't got the idea. What should I do if I have 5 characters in my encoded text? If I have 6 I will put ==, and if I have 7 I will put =, to reach 8, the second multiple of four. But what if I have a number of characters like 5 or 13, where I need 3 more to reach the next multiple of four? What can I do?
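
One way to check those counts is to compute them. The sketch below uses the join-the-bits technique described in the steps above: 8 bits per input byte, cut into 6-bit groups, with the last group filled out with zeros. The function name unpadded_length is made up for this example.

```python
def unpadded_length(n_bytes: int) -> int:
    """Number of Base64 characters produced before any '=' is added."""
    # 8 bits per input byte, 6 bits per output character, rounded up
    # because the last 6-bit group is filled out with zero bits.
    return (n_bytes * 8 + 5) // 6


for n_bytes in range(1, 10):
    chars = unpadded_length(n_bytes)
    padding = (-chars) % 4   # characters missing to reach a multiple of 4
    print(n_bytes, "bytes ->", chars, "characters, padding:", "=" * padding)
```

Every 3 input bytes give exactly 4 characters, and a leftover of 1 or 2 bytes gives 2 or 3 more. So the character count before padding is always a multiple of 4, or 2 or 3 past one. Counts like 5 or 13 cannot come out of this procedure, which is why === is never needed.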